I've been hunting around and can't seem to find a good solution for this. My Rails app stores its files in Amazon S3. I now need to send them to a remote (3rd party) service.
I'm using RestClient to post to the 3rd party server like this:
send_file = RestClient::Request.execute(
  :method => :post,
  :url => "http://remote-server-url.com",
  :payload => File.new("some_local_file.avi", 'rb'),
  :multipart => true,
  etc.... )
It works for local files, but how can I send a remote file from S3 directly to this 3rd party service?
I found an answer here where someone was using open-uri: ruby reading files from S3 with open-URI
I tested that for myself, and it worked.
:payload => open(URI.parse("http://amazon-s3-example.com/some_file.avi"))
But I've read a comment that says open-uri simply loads the remote file into memory. See the last comment on this answer: https://stackoverflow.com/a/264239/2785592
This wouldn't be ideal, as I'm handling potentially large video files. I've also read somewhere that RestClient loads even local files into memory; again, this isn't ideal. Does anyone know if that's true?
Surely I can't be the only one that has this problem. I know I could download the S3 file locally before sending it, but I was hoping to save on time & bandwidth. Also, if RestClient truly does load even local files into memory, then downloading it locally doesn't save me anything. Heh heh.
Any advice would be much appreciated. Thanks :)
Update: The remote server is just an API that responds to post requests. I don't have the ability to change anything on their end.
Take a look at: https://github.com/rest-client/rest-client/blob/master/lib/restclient/payload.rb
RestClient definitely supports streamed uploads. The condition is that the payload you pass is neither a string nor a hash, and that it responds to read and size (so, basically, a stream).
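To illustrate that duck-typing requirement, here is a minimal sketch of an object that satisfies it (StreamPayload is a hypothetical name for this example, not part of RestClient; I'm assuming size is used for the Content-Length header, as that's what stream payloads generally need):

```ruby
require 'stringio'

# Hypothetical wrapper: anything that responds to read and size
# (and is neither a string nor a hash) can be streamed.
class StreamPayload
  def initialize(io, size)
    @io = io
    @size = size
  end

  # Delegate read so chunks can be pulled on demand.
  def read(*args)
    @io.read(*args)
  end

  # The total byte count of the stream.
  def size
    @size
  end
end

payload = StreamPayload.new(StringIO.new("hello"), 5)
```

Any IO-like object works here; File.new("some_local_file.avi", 'rb') already responds to both methods, which is why your local-file example streams.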
On the S3 side, you need to grab a stream rather than read the whole object before sending it. Use http://docs.aws.amazon.com/sdkforruby/api/Aws/S3/Client.html#get_object-instance_method and tell it you want an IO object as the response target (not a string). For this purpose you can use an IO.pipe:
reader, writer = IO.pipe

fork do
  reader.close  # the child only writes
  s3.get_object(bucket: 'bucket-name', key: 'object-key') do |chunk|
    writer.write(chunk)
  end
  writer.close  # signal EOF to the reader
end

writer.close  # the parent only reads
You then pass the reader to RestClient::Payload.generate and use that as your payload. If the reading side (the upload) is slower than the writing side (the S3 download), you may still buffer a lot in memory; when writing, only accept as much as you are willing to buffer. You can check the size of the stream with writer.stat.size (inside the fork) and spin once it gets past a certain threshold.
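Stripped of the gems, the pipe pattern can be sketched in plain Ruby; the forked child stands in for the S3 download and the parent stands in for the consumer, so only what is in the OS pipe buffer sits in memory at any moment:

```ruby
# Minimal producer/consumer sketch of the IO.pipe pattern.
# The chunks array stands in for the S3 object's streamed chunks.
chunks = ["part1-", "part2-", "part3"]

reader, writer = IO.pipe

pid = fork do
  reader.close                          # child only writes
  chunks.each { |c| writer.write(c) }   # stand-in for the S3 download
  writer.close                          # signal EOF to the reader
end

writer.close                            # parent only reads
body = reader.read                      # stand-in for RestClient consuming the stream
reader.close
Process.wait(pid)

puts body                               # → "part1-part2-part3"
```

Note that OS pipe buffers are bounded, so the child's writes will block once the buffer fills until the reader catches up; that backpressure is what keeps memory use flat.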