rest, azure, azure-storage, azure-blob-storage, azure-web-roles

How to optimize large file transfers using a WCF service as an endpoint for Azure Storage?


I have a WCF REST service as an endpoint for Azure Storage. The service handles uploads and downloads of files that are usually 5-10 MB. When handling the stream (both for download and upload), the bytes sit in the Azure VM's RAM, right? Even though an upload is split into 4 MB blocks, those 4 MB are kept in RAM until the upload is complete. For a download, the bytes are kept until the download is complete. So if I have 1000 users downloading a file at the same time, the Azure VM would need 4 GB of RAM just for the transfers.

Is there a way to optimize this? Correct me if I'm wrong in assuming that the data is kept in the VM's RAM until the operation finishes. Should I use Microsoft's Azure REST service instead? Where does that service keep the data until the transfer is finished?


Solution

  • I think you can avoid having all the data in memory at once by doing a couple of things:

    1. Set up your WCF service to use a Streamed TransferMode (see the first sketch below this answer).
    2. Handle your streams appropriately in code, i.e., open the stream to BLOB storage and then copy between the WCF stream and the BLOB stream in chunks, or use Stream.CopyTo() (see the second sketch below).

    I tried to implement this about 6 months ago, but I couldn't make WCF Streamed mode work properly in my situation. It seems to be finicky to get working, and may not be possible with HTTP endpoints. Then again, maybe my WCF-fu is just too weak. If you do get something working please post it back here, as I would very much like to see it.

    Note that if you have multiple clients downloading the same data at once, you could keep a single copy in memory and stream it to each of them independently. But I don't believe that approach is compatible with the method I just described - you'd have to do one or the other.
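As a rough illustration of point 1, here is a minimal self-hosted setup with a streamed WebHttpBinding. The base address, endpoint path, message-size limit, and the FileService/IFileService names are assumptions for the sketch, not anything from the answer above:

```csharp
using System;
using System.ServiceModel;
using System.ServiceModel.Web;

class Program
{
    static void Main()
    {
        var binding = new WebHttpBinding
        {
            // Streamed mode lets WCF pull from a Stream as the client
            // sends/receives, instead of buffering the whole message
            // body in memory first.
            TransferMode = TransferMode.Streamed,
            // Raise the default 64 KB limit so large uploads are accepted.
            MaxReceivedMessageSize = 64 * 1024 * 1024
        };

        // WebServiceHost wires up the REST (webHttp) behavior for us.
        using (var host = new WebServiceHost(typeof(FileService),
                   new Uri("http://localhost:8080/files")))
        {
            host.AddServiceEndpoint(typeof(IFileService), binding, "");
            host.Open();
            Console.WriteLine("Listening; press Enter to stop.");
            Console.ReadLine();
        }
    }
}
```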
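And for point 2, a sketch of a contract and implementation that pipe between the WCF stream and the blob stream. It assumes the Microsoft.WindowsAzure.Storage client library; the container name and connection-string placeholder are hypothetical:

```csharp
using System.IO;
using System.ServiceModel;
using System.ServiceModel.Web;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Blob;

[ServiceContract]
public interface IFileService
{
    // In streamed mode the body must be a single Stream parameter,
    // so the file name travels in the URI.
    [OperationContract]
    [WebGet(UriTemplate = "{name}")]
    Stream Download(string name);

    [OperationContract]
    [WebInvoke(Method = "PUT", UriTemplate = "{name}")]
    void Upload(string name, Stream content);
}

public class FileService : IFileService
{
    private static CloudBlobContainer GetContainer()
    {
        // Placeholder connection string and container name.
        var account = CloudStorageAccount.Parse("<your-connection-string>");
        return account.CreateCloudBlobClient().GetContainerReference("files");
    }

    public Stream Download(string name)
    {
        // Return the blob's read stream directly; WCF pulls from it in
        // chunks as the client downloads, so only a small buffer sits
        // in RAM per request rather than the whole file.
        return GetContainer().GetBlockBlobReference(name).OpenRead();
    }

    public void Upload(string name, Stream content)
    {
        var blob = GetContainer().GetBlockBlobReference(name);
        using (var blobStream = blob.OpenWrite())
        {
            // Copy in 4 MB chunks; only one chunk is buffered at a time.
            content.CopyTo(blobStream, 4 * 1024 * 1024);
        }
    }
}
```

The design point here is that returning blob.OpenRead() hands the read loop to WCF, so each concurrent download buffers only a chunk at a time instead of the full file.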