c# string unit-testing globalization culture

Designing Tests for Culture/Globalisation problems


I'm concerned about the predictability of my application in handling string input in different cultures. It has been a problem in older software and I don't want it to be a problem in the new.

I generally have two sources of input: strings entered into a WPF application, and streams loaded from files that contain text. These culture-specific strings are generally put into a model before being used:

public struct MyModel
{
    public String Name;
}

I want to design a meaningful test to ensure that some logic, Result DoSomething(MyModel model);, can actually handle a model containing text that was entered on a different machine.

But how can I show a case where the difference matters?

For example, the following assertion fails:

var inNativeCulture = "[Something12345678.9:1] {YeS/nO}";
var inChineseCulture = inNativeCulture.ToString(new CultureInfo("zh-CN"));
Assert.That(inChineseCulture, Is.Not.EqualTo(inNativeCulture));

[Question]

How can I test DoSomething such that the test is able to fail if the strings are not converted to InvariantCulture?

Should I even bother? That is, will the string Something entered on a French keyboard always equal Something entered on a Chinese keyboard?

What can I test for that will mitigate Globalization problems?


Solution

  • The ToString method taking an IFormatProvider on a string is essentially a no-op. The documentation states "Returns this instance of String; no actual conversion is performed."

    Since you are concerned about avoiding issues, here's some general advice. First, it is very helpful to have a clear distinction in your mind between frontend (user-facing) strings and backend (database, wire, file, etc.) strings. Frontend strings should be generated and accepted according to the user's culture or application language. These strings should not be persisted (with a few exceptions, such as a document that will be read only by people and never by a machine). Backend strings should always use standard formats that will not change over time. Once you accept that the culture data used to generate and parse globalized strings changes over time, you can insulate yourself from those changes by never persisting user-facing strings.
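
As a rough illustration of the advice above, here is a minimal sketch (not the asker's actual DoSomething) of a culture-sensitivity check: run the same formatting logic under two different current cultures and compare the results. FormatNaive, FormatInvariant and RunUnder are hypothetical names introduced only for this example, and the sketch assumes the runtime has culture data for en-US and fr-FR (it will not behave this way in globalization-invariant mode).

```csharp
using System;
using System.Globalization;

public static class CultureSketch
{
    // Hypothetical backend formatter that relies on the current culture,
    // so its output depends on the machine it runs on.
    public static string FormatNaive(double value) => value.ToString();

    // Hypothetical backend formatter pinned to InvariantCulture.
    public static string FormatInvariant(double value) =>
        value.ToString("R", CultureInfo.InvariantCulture);

    // Runs func with CultureInfo.CurrentCulture temporarily set to cultureName,
    // then restores the previous culture.
    public static T RunUnder<T>(string cultureName, Func<T> func)
    {
        CultureInfo saved = CultureInfo.CurrentCulture;
        try
        {
            CultureInfo.CurrentCulture = new CultureInfo(cultureName);
            return func();
        }
        finally
        {
            CultureInfo.CurrentCulture = saved;
        }
    }

    public static void Main()
    {
        double value = 12345678.9;

        // string.ToString(IFormatProvider) returns the same instance,
        // which is why the assertion in the question cannot pass.
        string text = "[Something12345678.9:1] {YeS/nO}";
        Console.WriteLine(ReferenceEquals(text, text.ToString(new CultureInfo("zh-CN"))));

        // Culture-sensitive formatting: the two results are expected to differ
        // (decimal point vs. decimal comma) when culture data is available.
        string naiveUs = RunUnder("en-US", () => FormatNaive(value));
        string naiveFr = RunUnder("fr-FR", () => FormatNaive(value));
        Console.WriteLine(naiveUs == naiveFr);

        // Invariant formatting: the result should not depend on the current culture,
        // and it parses back to the original value.
        string invUs = RunUnder("en-US", () => FormatInvariant(value));
        string invFr = RunUnder("fr-FR", () => FormatInvariant(value));
        Console.WriteLine(invUs == invFr);
        Console.WriteLine(double.Parse(invFr, CultureInfo.InvariantCulture) == value);
    }
}
```

In an NUnit test the same idea could be expressed by asserting that the results produced under each culture are equal, or by decorating a test with NUnit's SetCulture attribute so that it runs under a specific culture. A test written this way fails for FormatNaive and passes for FormatInvariant, which is the kind of failure the question is looking for.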