tridion tridion-2011

Read UTF-8 content from the XML file in Tridion multimedia component - Templating C#


I am trying to read the content of an XML file embedded in a Multimedia Component through templating (C#). The XML file contains a few UTF-8 encoded characters. When I read the XML content, those characters come out as garbage (? symbols or rectangular boxes). Below are the code snippets I used in the C# template.

Code 1:

Component xmlMultimediaComponent = (Component)XMLMMSRepositoryObject;
// Read the XML in the Multimedia Component into a string, decoding it as UTF-8
byte[] binary = xmlMultimediaComponent.BinaryContent.GetByteArray();
string navXmlContent = (binary != null)
    ? Encoding.UTF8.GetString(binary, 0, binary.Length)
    : string.Empty;

Code 2:

using (MemoryStream ms = new MemoryStream())
{
  xmlMultimediaComponent.BinaryContent.WriteToStream(ms);
  ms.Seek(0, SeekOrigin.Begin);

  using (var streamReader = new StreamReader(ms, Encoding.UTF8))
  {                      
    string output = streamReader.ReadToEnd();
      ....
  }
}

In both cases, the output contains garbage characters in place of the UTF-8 encoded characters.

Any idea how to get the same UTF-8 content from the XML file in the Tridion Multimedia Component into the output string?

Note: The XML file in the Multimedia Component is saved with UTF-8 encoding.

Thanks in advance.


Solution

  • On further investigation, we noticed that the file associated with the Multimedia Component is ASCII encoded, not UTF-8. So its contents must not be explicitly decoded as UTF-8 when reading them; they should be read with the file's default encoding (i.e., ASCII in this case).

           Component xmlMultimediaComponent = XMLMMSRepositoryObject as Component;
           byte[] binary = xmlMultimediaComponent.BinaryContent.GetByteArray();
           // Decode the bytes with the file's actual encoding (ASCII here) instead of UTF-8
           string navContent = (binary != null) ? Encoding.GetEncoding("ASCII").GetString(binary) : string.Empty;
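
When it is not known up front how the uploaded file was saved, one option is to check whether the bytes are well-formed UTF-8 and fall back to another encoding only if they are not. The following is a minimal sketch under that assumption and is not part of the original answer: BinaryTextDecoder, IsValidUtf8 and Decode are hypothetical names, and the fallback encoding is whatever the caller believes the file was saved in. It uses only standard System.Text and System.IO calls and does not touch the Tridion API.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper, not part of Tridion or the original answer.
public static class BinaryTextDecoder
{
    // Returns true when the byte array is well-formed UTF-8.
    public static bool IsValidUtf8(byte[] bytes)
    {
        // throwOnInvalidBytes: true makes decoding throw on malformed input
        // instead of silently substituting replacement characters.
        var strictUtf8 = new UTF8Encoding(
            encoderShouldEmitUTF8Identifier: false,
            throwOnInvalidBytes: true);
        try
        {
            strictUtf8.GetString(bytes);
            return true;
        }
        catch (DecoderFallbackException)
        {
            return false;
        }
    }

    // Decodes as UTF-8 when the bytes are valid UTF-8, otherwise with the
    // caller-supplied fallback encoding (an assumption about the file).
    public static string Decode(byte[] bytes, Encoding fallback)
    {
        if (bytes == null || bytes.Length == 0)
        {
            return string.Empty;
        }

        Encoding encoding = IsValidUtf8(bytes) ? Encoding.UTF8 : fallback;

        using (var ms = new MemoryStream(bytes))
        // A byte order mark, if present, takes precedence over the chosen
        // encoding and is stripped from the returned text.
        using (var reader = new StreamReader(ms, encoding, true))
        {
            return reader.ReadToEnd();
        }
    }
}
```

With the byte array from the snippet above, this could be called as BinaryTextDecoder.Decode(binary, Encoding.Default). The check is only a heuristic: a pure ASCII file is also valid UTF-8 and decodes to the same string either way, so the fallback matters only for files saved in some other single-byte code page.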