Upload folder to S3 recursively
03 Dec 2013Lately, I needed to upload a folder to S3 with all of it’s files.
The use-case is compiling assets on the CI and then uploading it to S3 for the CDN to consume.
While searching for a gem that does it I encountered s3_uploader, but I really didn’t like it because it’s using Fog.
Generally, I don’t like gems that use other gems for no apparent reason, there’s absolutely no reason to include fog in my project just to upload files recursively.
I did however, like that it’s using multi threads in order to do the upload so I am doing the same in my solution.
I wrote a solution that uses the aws-s3 Ruby SDK, which was already included in my project anyway.
Here’s the code:
class S3FolderUpload
attr_reader :folder_path, :total_files, :s3_bucket
attr_accessor :files
# Initialize the upload class
#
# folder_path - path to the folder that you want to upload
# bucket - The bucket you want to upload to
# aws_key - Your key generated by AWS defaults to the environemt setting AWS_KEY_ID
# aws_secret - The secret generated by AWS
#
# Examples
# => uploader = S3FolderUpload.new("some_route/test_folder", 'your_bucket_name')
#
def initialize(folder_path, bucket, aws_key = ENV['AWS_KEY_ID'], aws_secret = ENV['AWS_SECRET'])
@folder_path = folder_path
@files = Dir.glob("#{folder_path}/**/*")
@total_files = files.length
@connection = AWS::S3.new(access_key_id: aws_key, secret_access_key: aws_secret)
@s3_bucket = @connection.buckets[bucket]
end
# public: Upload files from the folder to S3
#
# thread_count - How many threads you want to use (defaults to 5)
#
# Examples
# => uploader.upload!(20)
# true
# => uploader.upload!
# true
#
# Returns true when finished the process
def upload!(thread_count = 5)
file_number = 0
mutex = Mutex.new
threads = []
thread_count.times do |i|
threads[i] = Thread.new {
until files.empty?
mutex.synchronize do
file_number += 1
Thread.current["file_number"] = file_number
end
file = files.pop rescue nil
next unless file
# I had some more manipulation here figuring out the git sha
# For the sake of the example, we'll leave it simple
#
path = file
puts "[#{Thread.current["file_number"]}/#{total_files}] uploading..."
data = File.open(file)
next if File.directory?(data)
obj = s3_bucket.objects[path]
obj.write(data, { acl: :public_read })
end
}
end
threads.each { |t| t.join }
end
end
The usage is really simple
uploader = S3FolderUpload.new('folder_name', 'your_bucket', aws_key, aws_secret)
uploader.upload!
Since it’s using Threads, the upload is really fast, both from local machines and from servers.
Have fun coding!