Environment:
I am trying to download a ~880MB blob from a container. When I do, it throws the following error after the Ruby process hits ~500MB in size:
C:/opscode/chefdk/embedded/lib/ruby/2.1.0/net/protocol.rb:102:in `read': failed to allocate memory (NoMemoryError)
I have tried this both inside and outside of Chef, and with both the azure gem and the azure-storage gem. The result is the same with all four combinations (azure in Chef, azure in Ruby, azure-storage in Chef, azure-storage in Ruby).
Most of the troubleshooting I have found for these kinds of problems suggests streaming or chunking the download, but there does not appear to be a corresponding method or get_blob option to do so.
Code:
require 'azure/storage'
# vars
account_name = "myacct"
container_name = "myfiles"
access_key = "mykey"
installs_dir = "myinstalls"
# directory for files
create_dir = 'c:/' + installs_dir
Dir.mkdir(create_dir) unless File.exists?(create_dir)
# create azure client
Azure::Storage.setup(:storage_account_name => account_name, :storage_access_key => access_key)
azBlobs = Azure::Storage::Blob::BlobService.new
# get list of blobs in container
dlBlobs = azBlobs.list_blobs(container_name)
# download each blob to directory
dlBlobs.each do |dlBlob|
  puts "Downloading " + container_name + "/" + dlBlob.name
  portalBlob, blobContent = azBlobs.get_blob(container_name, dlBlob.name)
  File.open("c:/" + installs_dir + "/" + portalBlob.name, "wb") { |f|
    f.write(blobContent)
  }
end
I also tried using IO.binwrite() instead of File.open() and got the same result.
Suggestions?
As @coderanger said, your issue is caused by using get_blob, which loads the blob's entire content into memory at once. There are two ways to resolve it.
The maximum size for a block blob created via Put Blob is 256 MB for version 2016-05-31 and later, and 64 MB for older versions. If your blob is larger than 256 MB for version 2016-05-31 and later, or 64 MB for older versions, you must upload it as a set of blocks. For more information, see the Put Block and Put Block List operations. It's not necessary to also call Put Blob if you upload the blob as a set of blocks.
So, for a blob that consists of blocks, you can get the committed block list via list_blob_blocks and write those blocks one by one to a local file.
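A minimal sketch of that approach, assuming the blob_service, container, and blob names from your question (download_blob_in_blocks and block_ranges are hypothetical helper names, not part of the gem's API):

```ruby
# Turn a list of block sizes into inclusive [start, end] byte ranges.
def block_ranges(sizes)
  offset = 0
  sizes.map do |size|
    range = [offset, offset + size - 1]
    offset += size
    range
  end
end

# Sketch: download a block blob one block at a time so memory use is
# bounded by the largest block, not the whole ~880MB blob.
def download_blob_in_blocks(blob_service, container, blob_name, local_path)
  # Only the committed blocks make up the readable blob content.
  blocks = blob_service.list_blob_blocks(container, blob_name)[:committed]
  File.open(local_path, "wb") do |file|
    block_ranges(blocks.map(&:size)).each do |start_byte, end_byte|
      # Ranged read: get_blob accepts :start_range/:end_range options.
      _props, content = blob_service.get_blob(container, blob_name,
                                              :start_range => start_byte,
                                              :end_range   => end_byte)
      file.write(content)
    end
  end
end
```

You would call it with the same BlobService instance as in your question, e.g. download_blob_in_blocks(azBlobs, container_name, dlBlob.name, "c:/myinstalls/" + dlBlob.name).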
Or you can generate a temporary read URL for the blob via signed_uri, like this test code, and then download the blob via streaming and write it to a local file.
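The streaming half can be done with Ruby's standard Net::HTTP, which yields the response body in chunks when read_body is given a block. A sketch, assuming url is a signed blob URL you have already generated (stream_download is a hypothetical helper name):

```ruby
require 'net/http'
require 'uri'

# Sketch: stream an HTTP download straight to disk. read_body with a
# block yields the body in chunks as they arrive over the wire, so the
# full blob is never held in memory at once.
def stream_download(url, local_path)
  uri = URI(url)
  Net::HTTP.start(uri.host, uri.port, :use_ssl => uri.scheme == "https") do |http|
    http.request(Net::HTTP::Get.new(uri)) do |response|
      File.open(local_path, "wb") do |file|
        response.read_body { |chunk| file.write(chunk) }
      end
    end
  end
end
```

Generating the signed URL itself depends on your gem version; in azure-storage it involves Azure::Storage::Core::Auth::SharedAccessSignature, so check the linked test code for the exact call for your version.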