Coming from this question, I'm wondering why ä and ae are different (which makes sense) but ß and ss are treated as equal. I haven't found an answer on SO, even though this question seems to be related and even mentions "that ß will compare equal to SS in Germany, or similar", but it doesn't say why.
The only resource I found on MSDN was this: How to: Compare Strings
It mentions the following, but also doesn't explain why:
// "They dance in the street."
// Linguistically (in Windows), "ss" is equal to
// the German essetz: 'ß' character in both en-US and de-DE cultures.
.....
So why does this evaluate to true, both with the de-DE culture and with any other culture:
using System;
using System.Globalization;

var ci = new CultureInfo("de-DE");
int result = ci.CompareInfo.Compare("strasse", "straße", CompareOptions.IgnoreNonSpace); // 0
bool equals = String.Equals("strasse", "straße", StringComparison.CurrentCulture); // true
equals = String.Equals("strasse", "straße", StringComparison.InvariantCulture); // true
If you look at the Ä page, you'll see that Ä is not always a replacement for Æ (or ae), and that it is still used in various languages.
The letter ß, on the other hand:
While the letter "ß" has been used in other languages, it is now only used in German. However, it is not used in Switzerland, Liechtenstein or Namibia.[1] German speakers in Germany, Austria, Belgium,[2] Denmark,[3] Luxembourg[4] and South Tyrol, Italy[5] follow the standard rules for ß.
So the ß is used in a single language, with a single rule (ß == ss), while the Ä is used in multiple languages with multiple rules.
Note also what case folding is meant for:
Case folding is primarily used for caseless comparison of text, such as identifiers in a computer program, rather than actual text transformation
The official Unicode 7.0 Case Folding Properties tells us that
00DF; F; 0073 0073; # LATIN SMALL LETTER SHARP S
where 00DF is ß and 0073 is s, so for caseless comparison ß can be treated as ss.
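To see that folding rule in action, here is a minimal sketch in Python rather than .NET, chosen because Python's built-in str.casefold() applies Unicode full case folding directly. It only illustrates the Unicode mapping quoted above; it makes no claim about how .NET's culture-sensitive comparison is implemented internally. The helper name caseless_equal is made up for this example.

```python
# Full Unicode case folding, as exposed by Python's built-in str.casefold().
sharp_s = "\u00df"   # ß, LATIN SMALL LETTER SHARP S
a_umlaut = "\u00e4"  # ä, LATIN SMALL LETTER A WITH DIAERESIS

# The folding entry for 00DF maps it to the two-character sequence 0073 0073.
folded_sharp_s = sharp_s.casefold()
print([f"{ord(c):04X}" for c in folded_sharp_s])  # ['0073', '0073']

# There is no folding entry that turns ä into "ae"; it folds to itself.
folded_a_umlaut = a_umlaut.casefold()


def caseless_equal(left: str, right: str) -> bool:
    """Hypothetical helper: compare two strings after full case folding.

    No Unicode normalization is applied, so both inputs are assumed to be
    in the same (precomposed) form.
    """
    return left.casefold() == right.casefold()


print(caseless_equal("stra\u00dfe", "STRASSE"))  # True: ß folds to ss
print(caseless_equal("B\u00e4r", "Baer"))        # False: ä does not fold to ae
```

This matches the asymmetry from the question: ß has a single folding to ss, while ä has no folding to ae.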