Tags: c#, azure, azure-data-lake, azure-data-lake-gen2

AzureDataLake Upload Files


I want to upload files to Azure Data Lake, but I can't decide which method to use.

I found these two overloads of DataLakeFileClient.UploadAsync.

As I understand it, the only difference is whether we provide a path or a stream for the content. Is there any difference in behavior or efficiency between these two ways of providing content, or is it just a matter of convenience?


Solution

  • As I understand, the only difference is that we provide a path or fileStream to the content. However, is there any difference in behavior or efficiency between these two ways of providing content, or is it just about convenience?

    DataLakeFileClient.UploadAsync is the method that allows you to upload a file to an Azure Data Lake Storage Gen2 account.

    The difference between the two overloads lies in the way you provide the content of the file to be uploaded.

    UploadAsync(Stream):

    This overload takes the content to upload as a Stream object. It is useful when the content is already available as a stream, for example a MemoryStream, a network stream, or a FileStream that you have opened yourself.

    using (FileStream fileStream = File.OpenRead(filePath))
    {
        await fileClient.UploadAsync(fileStream, overwrite: true);
    }
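
    The Stream overload is the only option when the content does not exist as a file on disk. A minimal sketch, assuming a DataLakeFileClient that already points at the destination file; the class and method names here are hypothetical and not part of the SDK:

```csharp
using System.IO;
using System.Text;
using System.Threading.Tasks;
using Azure.Storage.Files.DataLake;

public static class InMemoryUploadSketch
{
    // Hypothetical helper: uploads a string as the content of the target file.
    // fileClient is assumed to be configured for the destination path already.
    public static async Task UploadTextAsync(DataLakeFileClient fileClient, string text)
    {
        // The content never touches the local disk, so there is no path to pass;
        // wrap the bytes in a MemoryStream and use the Stream overload.
        byte[] bytes = Encoding.UTF8.GetBytes(text);
        using (var content = new MemoryStream(bytes))
        {
            await fileClient.UploadAsync(content, overwrite: true);
        }
    }
}
```

    A call might look like `await InMemoryUploadSketch.UploadTextAsync(fileClient, "hello");`.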
    

    UploadAsync(String):

    This overload takes the path of a local file as input. It is useful when the file to upload is already on disk, because you do not have to open and dispose of the stream yourself.

    await fileClient.UploadAsync(filePath, overwrite: true);
    
    • Both overloads should behave the same and perform comparably; they differ only in how the content is supplied.
    • The choice between them depends on your use case. If the content is already available as a Stream object, the first overload is more convenient.
    • If the file is on disk, the second overload is more convenient.
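
    If efficiency is the concern, the transfer settings are likely to matter more than which overload is called, since both overloads also accept a DataLakeFileUploadOptions argument. A rough sketch, assuming a large local file; the helper names and the numeric values are illustrative assumptions, not recommendations:

```csharp
using System.Threading.Tasks;
using Azure.Storage;
using Azure.Storage.Files.DataLake;
using Azure.Storage.Files.DataLake.Models;

public static class TunedUploadSketch
{
    // Hypothetical helper: uploads a local file with explicit transfer settings.
    public static async Task UploadLargeFileAsync(DataLakeFileClient fileClient, string filePath)
    {
        var options = new DataLakeFileUploadOptions
        {
            TransferOptions = new StorageTransferOptions
            {
                // Illustrative values only; tune them for your network and file sizes.
                MaximumConcurrency = 4,
                MaximumTransferSize = 8 * 1024 * 1024
            }
        };

        // The same options object could be passed to the Stream overload instead.
        // Overwrite behavior here is governed by options.Conditions; check the SDK
        // documentation if the destination file may already exist.
        await fileClient.UploadAsync(filePath, options);
    }
}
```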

    Reference: Use .NET to manage data in Azure Data Lake Storage Gen2 - Azure Storage | Microsoft Learn