Tags: c#, azure, file-upload, azure-blob-storage, azure-data-lake-gen2

Uploading a large file (more than 2 GB): what is the best approach?


I want to upload a large file (more than 2 GB) to Azure Data Lake / Blob Storage.

I tried Azure's cloud blob method PutBlockListAsync (ref: https://www.andrewhoefling.com/Blog/Post/uploading-large-files-to-azure-blob-storage-in-c-sharp).

I will also look into gRPC.

What are the different approaches I can try to improve performance when uploading such huge files?

  • Chunked upload
  • Buffered upload
  • gRPC
  • AzCopy
  • Any other technique, or a hybrid of these


Solution

  • You can use the Azure Storage Data Movement library (Microsoft.Azure.Storage.DataMovement) to upload large files to a file share or blob storage.

    I tried this in my environment and got the results below:

    Code:

    using System;
    using System.Threading;
    using Microsoft.Azure.Storage;
    using Microsoft.Azure.Storage.Blob;
    using Microsoft.Azure.Storage.DataMovement;
    
    class Program
    {
        public static void Main(string[] args)
        {
            string storageConnectionString = "<Connection string>";
            CloudStorageAccount account = CloudStorageAccount.Parse(storageConnectionString);
            CloudBlobClient blobClient = account.CreateCloudBlobClient();
            CloudBlobContainer blobContainer = blobClient.GetContainerReference("test");
            blobContainer.CreateIfNotExists();
    
            // Local source file and destination block blob
            string sourceFile = @"C:\Users\download\sample.docx";
            CloudBlockBlob destBlob = blobContainer.GetBlockBlobReference("sample.docx");
    
            // Allow up to 64 parallel operations for the transfer
            TransferManager.Configurations.ParallelOperations = 64;
    
            // Set up the transfer context and track the upload progress
            SingleTransferContext context = new SingleTransferContext
            {
                ProgressHandler = new Progress<TransferStatus>(progress =>
                {
                    Console.WriteLine("Bytes uploaded: {0}", progress.BytesTransferred);
                })
            };
    
            // Upload the file to the block blob
            var task = TransferManager.UploadAsync(
                sourceFile, destBlob, null, context, CancellationToken.None);
            task.Wait();
        }
    }
    
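    If transfers are still slow, the Data Movement library also lets you tune the block size alongside the parallelism. As a sketch (the exact property set can vary by library version, so verify `TransferManager.Configurations` on yours):

    ```csharp
    // Larger blocks mean fewer round trips for multi-GB files.
    // BlockSize is in bytes; 100 MB is the classic per-block limit for block blobs.
    // (Assumption: BlockSize is available on your DMLib version.)
    TransferManager.Configurations.BlockSize = 100 * 1024 * 1024;
    TransferManager.Configurations.ParallelOperations = 64;
    ```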

    As a test, I uploaded a 100 MB file using the code above. You can also combine chunked uploads with the gRPC approach, as Nour suggested.
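    For reference, a chunked upload with the newer Azure.Storage.Blobs (v12) SDK might look roughly like this. This is a sketch, not a drop-in solution: the connection string, the container name "test", and the 100 MB block size are assumptions you would replace with your own values.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Text;
    using System.Threading.Tasks;
    using Azure.Storage.Blobs;
    using Azure.Storage.Blobs.Specialized;

    class ChunkedUpload
    {
        // Stage the file in fixed-size blocks, then commit the block list.
        public static async Task UploadInChunksAsync(string filePath)
        {
            const int BlockSize = 100 * 1024 * 1024; // 100 MB per block (assumption)
            var container = new BlobContainerClient("<connection-string>", "test");
            await container.CreateIfNotExistsAsync();
            BlockBlobClient blob = container.GetBlockBlobClient(Path.GetFileName(filePath));

            var blockIds = new List<string>();
            var buffer = new byte[BlockSize];
            int bytesRead, blockNumber = 0;

            using FileStream file = File.OpenRead(filePath);
            while ((bytesRead = await file.ReadAsync(buffer, 0, BlockSize)) > 0)
            {
                // Block IDs must be base64 strings of equal length.
                string blockId = Convert.ToBase64String(
                    Encoding.UTF8.GetBytes(blockNumber++.ToString("d6")));
                using var chunk = new MemoryStream(buffer, 0, bytesRead, writable: false);
                await blob.StageBlockAsync(blockId, chunk);
                blockIds.Add(blockId);
            }

            // Nothing becomes visible until the block list is committed.
            await blob.CommitBlockListAsync(blockIds);
        }
    }
    ```

    Staged blocks can also be uploaded in parallel for more throughput; the sequential loop above keeps memory usage to a single buffer.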

    Console:

    (Screenshot: console output showing the bytes uploaded.)

    Portal:

    (Screenshot: the uploaded blob shown in the Azure portal.)

    You can also use AzCopy to upload large files to Data Lake Storage.
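    A typical AzCopy invocation might look like the following. The account name, filesystem name, local path, and SAS token are all placeholders to substitute with your own:

    ```shell
    # Upload a large local file to a Data Lake Storage Gen2 filesystem.
    # Larger blocks (--block-size-mb) reduce the number of round trips.
    azcopy copy "C:\Users\download\large-file.bin" \
      "https://<account>.dfs.core.windows.net/<filesystem>/large-file.bin?<SAS-token>" \
      --block-size-mb 100
    ```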