c#, filestream, xmlreader

Unable to parse the unicode character


I am trying to read an XML file using XmlReader. I have created a dictionary to store each MIME type and its corresponding extension. The MIME types are stored in the file in this format: <MimeToExtension MimeType="image/x‑portable‑bitmap" Extension=".pbm" />

When I try to fetch the value from the dictionary using the key "image/x‑portable‑bitmap", it doesn't return any value, because "image/x-portable-bitmap" is saved with a different character in place of each hyphen.

When I paste the saved value into Notepad++, the - character is changed into square brackets. How can I resolve this?

// Dispose the stream as well as the reader.
using (FileStream filestream = File.OpenRead(mimeTypeToExtension))
using (XmlReader reader = XmlReader.Create(filestream))
{
    while (reader.Read())
    {
        if (reader.NodeType == XmlNodeType.Element)
        {
            if (reader.HasAttributes && reader.AttributeCount == 2)
            {
                // Read the attributes by name so each variable matches its contents.
                string mimeType = reader.GetAttribute("MimeType");
                string extension = reader.GetAttribute("Extension");
                // The MIME type is the dictionary key; the extension is the value.
                if (!string.IsNullOrEmpty(mimeType) && !string.IsNullOrEmpty(extension) &&
                    !fileTypes.ContainsKey(mimeType))
                    fileTypes.Add(mimeType, extension);
            }
        }
    }
}

Solution

  • That's because you probably copied and pasted the MIME type from somewhere on the internet and got the wrong hyphen.

    Your hyphen is a non-breaking hyphen (U+2011). You want a regular hyphen-minus (U+002D). Manually replace all the hyphens in your XML file and code, or just copy this:

    "image/x-portable-bitmap"
    

    Always be careful when copying code/text/etc. from the internet. This issue often occurs with quotes too, because most CMS don't account for programmers' needs and just replace some characters to make them "look better" or to avoid formatting issues.
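
If you want to check which character a string really contains, or guard against this at load time, something like the following might help. It is only a minimal sketch: the class `HyphenCheck` and its methods `Describe` and `NormalizeHyphens` are hypothetical names made up for illustration, and the list of look-alike characters is an assumption that you may need to extend.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class HyphenCheck
{
    // Formats each UTF-16 code unit as U+XXXX so look-alike characters stand out.
    public static string Describe(string text) =>
        string.Join(" ", text.Select(c => $"U+{(int)c:X4}"));

    // Maps a few common hyphen look-alikes to the ASCII hyphen-minus (U+002D).
    // The set of characters handled here is an assumption, not an exhaustive list.
    public static string NormalizeHyphens(string value) =>
        value.Replace('\u2010', '-')   // hyphen
             .Replace('\u2011', '-')   // non-breaking hyphen
             .Replace('\u2012', '-')   // figure dash
             .Replace('\u2013', '-')   // en dash
             .Replace('\u2212', '-');  // minus sign

    static void Main()
    {
        // The MIME type as it might arrive from the file, with non-breaking hyphens.
        string fromFile = "image/x\u2011portable\u2011bitmap";

        Console.WriteLine(Describe("x\u2011p")); // U+0078 U+2011 U+0070
        Console.WriteLine(Describe("x-p"));      // U+0078 U+002D U+0070

        // Normalize the key before adding it, so a lookup with regular hyphens succeeds.
        var fileTypes = new Dictionary<string, string>();
        fileTypes[NormalizeHyphens(fromFile)] = ".pbm";
        Console.WriteLine(fileTypes.ContainsKey("image/x-portable-bitmap")); // True
    }
}
```

Fixing the XML file itself is still the cleaner solution. Normalizing on load only hides the bad data, so treat it as a safety net for files you don't control.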