
CloudBlockBlob DownloadTextAsync Behavior Difference


I am using an Azure Function with an Event Grid trigger and a CloudBlockBlob input binding. The content is downloaded from the CloudBlockBlob using DownloadTextAsync(AccessCondition accessCondition, BlobRequestOptions options, OperationContext operationContext).

If the file was generated using XmlDocument, DownloadTextAsync returns gibberish. However, if the file was generated using FileStream, it works fine. The two ways of generating the file are shown below:

  1. Using XmlDocument
var stringwriter = new System.IO.StringWriter();
var serializer = new XmlSerializer(typeof(List<ContractName>), new XmlRootAttribute("RootAttributeName"));
serializer.Serialize(stringwriter, contractData);
var xmlString = stringwriter.ToString();

XmlDocument doc = new XmlDocument();
doc.LoadXml(xmlString);
doc.PreserveWhitespace = true;
doc.Save(fileName);
  2. Using FileStream
var serializer = new XmlSerializer(typeof(List<ContractName>), new XmlRootAttribute("RootAttributeName"));

var file = new FileStream(fileName, FileMode.OpenOrCreate);           
serializer.Serialize(file, contractData);
file.Close();

Code used to download the content:

  1. Using DownloadTextAsync
private static async System.Threading.Tasks.Task<string> DownloadContentAsync_DownloadTextAsync(string storageAccountConnectionString, string containerName, string blobName)
        {
            CloudBlobContainer container = GetContainer(storageAccountConnectionString, containerName);
            ICloudBlob blob = await container.GetBlobReferenceFromServerAsync(blobName);

            // Download the blob content
            string xmlBlobContent =
                await (blob as CloudBlockBlob).DownloadTextAsync(
                    null,
                    new BlobRequestOptions { LocationMode = LocationMode.PrimaryThenSecondary },
                    new OperationContext());

            return xmlBlobContent;
        }
  2. Using DownloadToStreamAsync
private static async System.Threading.Tasks.Task<string> DownloadContentAsync_DownloadToStreamAsync(string storageAccountConnectionString, string containerName, string blobName)
        {
            CloudBlobContainer container = GetContainer(storageAccountConnectionString, containerName);
            ICloudBlob blob = await container.GetBlobReferenceFromServerAsync(blobName);

            // Download the blob content
            MemoryStream resultStream = new MemoryStream();
            await (blob as CloudBlockBlob).DownloadToStreamAsync(
                resultStream,
                null,
                new BlobRequestOptions { LocationMode = LocationMode.PrimaryThenSecondary },
                new OperationContext());
            string xmlBlobContent = System.Text.Encoding.UTF8.GetString(resultStream.ToArray());

            return xmlBlobContent;
        }

Why does DownloadTextAsync return a different result for the two files?


Solution

  • Updated 0713:

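    Figured it out. The root cause is that the XML file generated through XmlDocument is encoded as UTF-16, whereas the one generated through FileStream is encoded as UTF-8. Serializing into a StringWriter produces an XML declaration with encoding="utf-16", and doc.Save(fileName) writes the file in the encoding named in that declaration. DownloadTextAsync() without an encoding argument decodes the blob as UTF-8, so the UTF-16 file comes back as gibberish.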
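
    The encoding difference can be reproduced locally without a storage account. The following is a minimal sketch, not the asker's actual code: it assumes a List<string> as a stand-in for the question's List<ContractName> and uses hypothetical temp-file names. It serializes the same data both ways and looks at the first bytes of each file.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Text;
    using System.Xml;
    using System.Xml.Serialization;

    // Stand-in data: the ContractName type from the question is not shown, so strings are used.
    var contractData = new List<string> { "alpha", "beta" };
    var serializer = new XmlSerializer(typeof(List<string>), new XmlRootAttribute("RootAttributeName"));

    // Path 1: StringWriter -> XmlDocument -> Save(fileName)
    var stringWriter = new StringWriter();
    serializer.Serialize(stringWriter, contractData);
    string xmlString = stringWriter.ToString(); // declaration names utf-16

    var doc = new XmlDocument();
    doc.LoadXml(xmlString);
    string docPath = Path.Combine(Path.GetTempPath(), "UsingXmlDoc.xml"); // hypothetical path
    doc.Save(docPath); // written in the encoding named in the declaration

    // Path 2: serialize straight into a FileStream
    string streamPath = Path.Combine(Path.GetTempPath(), "UsingFileStream.xml"); // hypothetical path
    using (var file = new FileStream(streamPath, FileMode.Create))
    {
        serializer.Serialize(file, contractData);
    }

    byte[] docBytes = File.ReadAllBytes(docPath);
    byte[] streamBytes = File.ReadAllBytes(streamPath);

    // Print the XML declaration and the leading bytes of each file.
    Console.WriteLine(xmlString.Substring(0, xmlString.IndexOf("?>") + 2));
    Console.WriteLine("XmlDocument file starts with: " + BitConverter.ToString(docBytes, 0, 4));
    Console.WriteLine("FileStream file starts with:  " + BitConverter.ToString(streamBytes, 0, 4));

    // Decoding the UTF-16 file as UTF-8 leaves NUL characters between the letters.
    Console.WriteLine(Encoding.UTF8.GetString(docBytes).Contains('\0'));
    ```

    The file saved through XmlDocument begins with the UTF-16 little-endian byte order mark (FF-FE), which is what a UTF-8 decode turns into unreadable text.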

    So the fix is to specify UTF-8 explicitly when saving the XmlDocument (the FileStream code needs no change). Sample code:

    Generating the XML file with XmlDocument:

                   //2. Using XMLDoc
                    serializer.Serialize(stringwriter, contractData);
                    var xmlString = stringwriter.ToString();
                    XmlDocument doc = new XmlDocument();
    
                    doc.LoadXml(xmlString);
                    doc.PreserveWhitespace = true;
                    string fileName = @"C:\TestBlobDownloadContent\UsingXMLDoc" + count + ".xml";
    
                    //encoding as utf-8
                    using (TextWriter sw = new StreamWriter(fileName, false, Encoding.UTF8))
                    {
                        doc.Save(sw);
                    }
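
    To check that the change takes effect, the saved file can be inspected. A minimal sketch, assuming a small inline document carrying a utf-16 declaration (as the StringWriter path produces) and a hypothetical temp-file name:

    ```csharp
    using System;
    using System.IO;
    using System.Text;
    using System.Xml;

    // Inline stand-in for the serializer output, which carries a utf-16 declaration.
    string xmlString =
        "<?xml version=\"1.0\" encoding=\"utf-16\"?>" +
        "<RootAttributeName><string>alpha</string></RootAttributeName>";

    var doc = new XmlDocument();
    doc.LoadXml(xmlString);

    string fileName = Path.Combine(Path.GetTempPath(), "UsingXmlDocUtf8.xml"); // hypothetical path

    // Saving through a UTF-8 TextWriter: the writer's encoding decides the bytes on disk
    // and replaces the encoding named in the XML declaration.
    using (TextWriter sw = new StreamWriter(fileName, false, Encoding.UTF8))
    {
        doc.Save(sw);
    }

    // Read the file back as UTF-8 and print its XML declaration.
    string savedText = File.ReadAllText(fileName, Encoding.UTF8);
    Console.WriteLine(savedText.Substring(0, savedText.IndexOf("?>") + 2));
    ```

    Encoding.UTF8 writes a byte order mark at the start of the file. If that is undesirable, new UTF8Encoding(false) can be passed to the StreamWriter instead.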
    

    When reading the XML file from blob storage via the DownloadTextAsync() method, there is no need to specify an encoding, as below:

            // Download the blob content
            string xmlBlobContent =
                await (blob as CloudBlockBlob).DownloadTextAsync(
                    null,
                    new BlobRequestOptions { LocationMode = LocationMode.PrimaryThenSecondary },
                    new OperationContext());
    

    Original answer:

    This is an encoding mismatch: the blob is stored as UTF-16, but DownloadTextAsync() decodes it as UTF-8 by default.

    Solution:

    Pass System.Text.Encoding.Unicode (UTF-16 little-endian) as the first argument to DownloadTextAsync(), as below:

     string xmlBlobContent =
                 await (blob as CloudBlockBlob).DownloadTextAsync(
                                System.Text.Encoding.Unicode,
                                null,
                                new BlobRequestOptions { LocationMode = LocationMode.PrimaryThenSecondary },
                                new OperationContext());
    
