c# string globalization cultureinfo

Looking for String operations edge cases. What do I need to test?


I am getting to the last stage of my rope (a more scalable version of String) implementation. Obviously, I want all operations to give the same result as the operations on Strings whenever possible.

Doing this for ordinal operations is pretty simple, but I am worried about implementing culture-sensitive operations correctly, especially since I know only two languages, and in both of them culture-sensitive operations behave exactly like ordinal ones!

So are there any specific things that I could test to get at least some confidence that I am doing things correctly? I know, for example, about ß being equal to SS when ignoring case in German, and about dotted and dotless i in Turkish.
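
For reference, the two cases mentioned above could be exercised roughly as follows. This is only a sketch: `CultureEdgeCases` and its method names are made up for illustration, and it assumes culture data for `tr-TR` and `de-DE` is available at run time. The outcome of the ß comparison may differ between platforms and runtime versions, so it seems safer to compare the rope's result against what `string` returns on the same machine than to hard-code an expected value.

```csharp
using System;
using System.Globalization;

// Hypothetical helper: thin wrappers over the built-in string operations,
// meant to serve as the reference that a rope implementation is checked against.
static class CultureEdgeCases
{
    // Turkish casing: 'i' upper-cases to dotted capital I (U+0130),
    // and 'I' lower-cases to dotless i (U+0131).
    public static string UpperTurkish(string s) =>
        s.ToUpper(CultureInfo.GetCultureInfo("tr-TR"));

    public static string LowerTurkish(string s) =>
        s.ToLower(CultureInfo.GetCultureInfo("tr-TR"));

    // Culture-sensitive, case-insensitive comparison.
    // For inputs such as "ß" and "SS" under "de-DE", treat the returned value
    // as the reference result rather than assuming it is 0.
    public static int CompareIgnoreCase(string a, string b, string cultureName) =>
        string.Compare(a, b, CultureInfo.GetCultureInfo(cultureName),
                       CompareOptions.IgnoreCase);
}
```

A rope test could then call, say, `CultureEdgeCases.CompareIgnoreCase("ß", "SS", "de-DE")` and check that the rope's own comparison returns a value with the same sign.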


Solution

  • Surrogate pairs, if you plan to support them - including invalid combinations (e.g. only one part of one).

    If you're doing encoding and decoding, make sure you retain enough state to cope with being given arbitrary blocks of binary data to decode, any of which may end halfway through a character, with the remaining half coming in the next block.
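
Both points above can be tried out with a small sketch like the one below. `SurrogateChecks` and its method names are hypothetical; the only library pieces used are `char.IsHighSurrogate` / `char.IsLowSurrogate` and the stateful `System.Text.Decoder` obtained from `Encoding.UTF8.GetDecoder()`. UTF-8 is chosen here purely as an example encoding.

```csharp
using System;
using System.Text;

// Hypothetical helpers for testing surrogate handling and chunked decoding.
static class SurrogateChecks
{
    // Returns true if the text contains a high surrogate not followed by a
    // low surrogate, or a low surrogate not preceded by a high surrogate.
    public static bool HasUnpairedSurrogate(string s)
    {
        for (int i = 0; i < s.Length; i++)
        {
            if (char.IsHighSurrogate(s[i]))
            {
                if (i + 1 < s.Length && char.IsLowSurrogate(s[i + 1]))
                {
                    i++; // skip the low half of a valid pair
                    continue;
                }
                return true;
            }
            if (char.IsLowSurrogate(s[i])) return true;
        }
        return false;
    }

    // Decodes UTF-8 bytes in fixed-size chunks (chunkSize must be positive).
    // The Decoder instance keeps the trailing bytes of an incomplete character
    // between calls, so a chunk boundary in the middle of a character is handled.
    public static string DecodeInChunks(byte[] bytes, int chunkSize)
    {
        Decoder decoder = Encoding.UTF8.GetDecoder();
        var result = new StringBuilder();
        char[] buffer = new char[Encoding.UTF8.GetMaxCharCount(chunkSize)];

        for (int offset = 0; offset < bytes.Length; offset += chunkSize)
        {
            int count = Math.Min(chunkSize, bytes.Length - offset);
            bool flush = offset + count >= bytes.Length; // flush only on the last chunk
            int chars = decoder.GetChars(bytes, offset, count, buffer, 0, flush);
            result.Append(buffer, 0, chars);
        }
        return result.ToString();
    }
}
```

A useful test is to take a string that mixes ASCII, a two-byte character, and a character outside the BMP (for example `"a\uD83D\uDE00ß"`), encode it, and check that `DecodeInChunks` returns the original text for every chunk size from 1 up to the byte length. Decoding each chunk independently with `Encoding.UTF8.GetString` is the stateless variant that this is meant to catch.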