node.js, progress-bar, azure-storage

Azure Blob Storage - upload file with progress


I have the following code, which is fairly standard for uploading files to Azure Blob Storage. However, instead of onProgress being called many times during the upload, it is only ever called once, with a value equal to file.size. The file is being sent to Azure (slowly), but the progress callback fires only once, when the upload has finished.

    const requestOptions = this.mergeWithDefaultOptions(perRequestOptions);
    const client = this.getRequestClient(requestOptions);
    const containerClient = await client.getContainerClient(this.options.containerName);
    const blobClient = await containerClient.getBlockBlobClient(file.name);
    const uploadStatus = await blobClient.upload(file.buffer, file.size, {onProgress: progressCallBack});

What I would like to know is whether this behavior is normal for this library. (For downloading files from Azure, the same approach reports progress correctly.)


Solution

  • According to my test, the upload method is a non-parallel upload method: it sends a single Put Blob request to the Azure Storage server, which is why onProgress fires only once. For more details, see the documentation for upload.

    So if you want onProgress to be executed many times, I suggest you use the uploadStream method instead. It uploads the data with Put Block operations followed by a Put Block List operation. For more details, see the documentation for uploadStream.

    For example:

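    const fs = require("fs");
    const {
      BlobServiceClient,
      StorageSharedKeyCredential,
    } = require("@azure/storage-blob");

    // Assumption: the account name and key are supplied through environment
    // variables; these variable names are illustrative only.
    const accountName = process.env.AZURE_STORAGE_ACCOUNT_NAME;
    const accountKey = process.env.AZURE_STORAGE_ACCOUNT_KEY;

    async function main() {
      try {
        const creds = new StorageSharedKeyCredential(accountName, accountKey);
        const blobServiceClient = new BlobServiceClient(
          `https://${accountName}.blob.core.windows.net`,
          creds
        );
        const containerClient = blobServiceClient.getContainerClient("upload");
        const blob = containerClient.getBlockBlobClient(
          "spark-3.0.1-bin-hadoop3.2.tgz"
        );

        const maxConcurrency = 20; // maximum number of blocks uploaded in parallel
        const blockSize = 4 * 1024 * 1024; // size of each uploaded block (4 MiB)
        const res = await blob.uploadStream(
          // Read the file in chunks that match the block size.
          fs.createReadStream("d:/spark-3.0.1-bin-hadoop3.2.tgz", {
            highWaterMark: blockSize,
          }),
          blockSize,
          maxConcurrency,
          // Called repeatedly as blocks are uploaded; ev.loadedBytes is the running total.
          { onProgress: (ev) => console.log(ev) }
        );
        console.log(res._response.status);
      } catch (error) {
        console.log(error);
      }
    }

    main();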
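    The onProgress callback in the example just logs the raw event. If you want a percentage instead, something like the following may help. It is only a minimal sketch: it assumes each event carries a cumulative loadedBytes value and that you know the total size up front (for example from file.size). makeProgressReporter is a hypothetical helper name, not part of the SDK.

```javascript
// Hypothetical helper (not part of the SDK): turns progress events into
// whole-number percentages. Assumes each event exposes a cumulative
// `loadedBytes` value and that the total size is known in advance.
function makeProgressReporter(totalBytes, report) {
  let lastPercent = -1;
  return (ev) => {
    // Treat a zero-byte upload as complete to avoid dividing by zero.
    const percent =
      totalBytes > 0
        ? Math.min(100, Math.floor((ev.loadedBytes * 100) / totalBytes))
        : 100;
    // Report only when the whole-number percentage changes.
    if (percent !== lastPercent) {
      lastPercent = percent;
      report(percent);
    }
  };
}

// Simulated events for a 10 MiB upload sent in 4 MiB blocks.
const MiB = 1024 * 1024;
const seen = [];
const onProgress = makeProgressReporter(10 * MiB, (p) => seen.push(p));
[4 * MiB, 8 * MiB, 10 * MiB].forEach((loadedBytes) =>
  onProgress({ loadedBytes })
);
console.log(seen); // [ 40, 80, 100 ]
```

    In the uploadStream call you would then pass { onProgress: makeProgressReporter(fileSize, (p) => console.log(p + "%")) }, where fileSize is whatever total you have available.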
    

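    Since the question already has the file in memory as file.buffer, another option that may fit is the uploadData method of BlockBlobClient, which takes a Buffer directly. The sketch below is untested against a real storage account and rests on two assumptions: that uploadData sends payloads no larger than maxSingleShotSize as a single request (so that option has to be lowered for a small buffer to be split into blocks), and that progress is then reported as blocks complete. uploadBufferWithProgress is a hypothetical name, and the block size and concurrency values are illustrative.

```javascript
// Sketch: upload an in-memory Buffer in blocks so that progress is reported
// more than once. `blockBlobClient` is assumed to be a BlockBlobClient from
// @azure/storage-blob; `uploadBufferWithProgress` is a hypothetical name.
function uploadBufferWithProgress(blockBlobClient, buffer, onPercent) {
  const blockSize = 4 * 1024 * 1024; // 4 MiB per block (illustrative value)
  return blockBlobClient.uploadData(buffer, {
    blockSize,
    // Assumption: payloads up to this size are sent in one request, so it is
    // lowered here to make larger buffers go through the block-based path.
    maxSingleShotSize: blockSize,
    concurrency: 4, // illustrative value
    onProgress: (ev) =>
      onPercent(
        buffer.length === 0
          ? 100
          : Math.floor((ev.loadedBytes * 100) / buffer.length)
      ),
  });
}
```

    With the client from the question, the call would look like await uploadBufferWithProgress(blobClient, file.buffer, (p) => console.log(p + "%")). A buffer smaller than the block size would still go out as one request and report progress once.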