c# unicode encoding utf-8 utf-16

Internal encoding for my application


My desktop C# application receives various documents from users, possibly in different encodings.

I need to show users their existing documents, let them edit those documents in my UI, and store them for future use.

Handling the notion of "encoding" at each of these steps seems complex to me. I was thinking of always converting the user's input documents to UTF-8 internally, so that my UI and data store do not need to worry about encodings. Then, when the user wants the document back as a file, I would ask which encoding to use.

Does this make sense? Are encodings interoperable? What if I only support Unicode?


Solution

  • Encodings are not interoperable, since some have characters that others don't have.

    A Unicode internal representation is a good idea, since Unicode has the widest character set, but I'd advise saving the document back in its original encoding if any characters added during editing still exist in that encoding. If they don't, tell the user that you'll save in a Unicode encoding (UTF-8, for example) so that these characters are encoded correctly.
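
The advice above could be sketched roughly as follows. In C#, a `string` is already Unicode (UTF-16) in memory, so decoding the file once on input gives you the Unicode internal representation. This is only a minimal sketch, not a definitive implementation: `CanEncode` and `ChooseSaveEncoding` are hypothetical helper names, and ISO-8859-1 with a euro sign is just an assumed example of an original encoding and an added character. It is written as a top-level program (C# 9 or later).

```csharp
using System;
using System.Text;

// Assumed scenario: the user's file arrived as ISO-8859-1 bytes.
Encoding original = Encoding.GetEncoding("iso-8859-1");
byte[] fileBytes = original.GetBytes("caf\u00E9");   // "cafe" with e-acute

// Decode once at the boundary; from here on the app only handles strings.
string text = original.GetString(fileBytes);
Console.WriteLine(CanEncode(text, original));        // True

// The user adds a euro sign, which ISO-8859-1 cannot represent.
string edited = text + " \u20AC";
Console.WriteLine(CanEncode(edited, original));      // False

// Pick the encoding to write with; a real app would prompt before falling back.
Encoding target = ChooseSaveEncoding(edited, original);
byte[] savedBytes = target.GetBytes(edited);
Console.WriteLine(target.WebName);                   // utf-8

// Hypothetical helper: true if every character of s exists in the given encoding.
static bool CanEncode(string s, Encoding encoding)
{
    // Exception fallbacks make GetBytes throw instead of silently writing '?'.
    Encoding strict = Encoding.GetEncoding(
        encoding.CodePage,
        EncoderFallback.ExceptionFallback,
        DecoderFallback.ExceptionFallback);
    try
    {
        strict.GetBytes(s);
        return true;
    }
    catch (EncoderFallbackException)
    {
        return false;
    }
}

// Hypothetical helper: keep the original encoding when it is lossless,
// otherwise fall back to UTF-8 (without a byte order mark).
static Encoding ChooseSaveEncoding(string s, Encoding originalEncoding)
{
    return CanEncode(s, originalEncoding)
        ? originalEncoding
        : new UTF8Encoding(encoderShouldEmitUTF8Identifier: false);
}
```

Note that on .NET Core and .NET 5+, only a handful of encodings (UTF-8, UTF-16, UTF-32, ASCII, ISO-8859-1) are available by default; legacy code pages such as Windows-1252 need `Encoding.RegisterProvider(CodePagesEncodingProvider.Instance)` from the System.Text.Encoding.CodePages package first.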