To be more precise: does the latest version of C# (C# 12 on .NET 8.0) use UTF-8 or UTF-16 for strings?
I am confused because: https://learn.microsoft.com/en-us/dotnet/standard/base-types/character-encoding-introduction
A string is logically a sequence of 16-bit values, each of which is an instance of the char struct.
And here: https://learn.microsoft.com/en-us/dotnet/core/compatibility/globalization/5.0/icu-globalization-api
.NET 5 and later versions use International Components for Unicode (ICU) libraries for globalization functionality when running on Windows 10 May 2019 Update or later.
And what if it runs on Linux? Do I have to provide the ICU library myself? Or does the statement mean that C# still uses 16-bit values, strips the zero bytes for Latin-script text, and then maps the result to ICU?
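In C#, strings are stored in memory as UTF-16, and that has not changed in C# 12 / .NET 8. Each char is one 16-bit UTF-16 code unit, which is not always a whole character: characters in the Basic Multilingual Plane take one char, while characters outside it (most emoji, for example) take two chars, a surrogate pair. Nothing is stripped for Latin text; an ASCII letter still occupies 16 bits. UTF-8 only comes into play when you explicitly encode a string to bytes, for example for files or network I/O. ICU has nothing to do with how strings are stored: it supplies globalization behaviour such as culture-aware comparison, sorting and casing, and it works on the same UTF-16 strings. On Linux, .NET uses the ICU library installed on the system (typically the distribution's libicu package) unless globalization-invariant mode is enabled.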
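To make the difference between a char (a UTF-16 code unit) and a character concrete, here is a minimal sketch. It assumes a .NET 6 or later console project using top-level statements; the variable names are only illustrative.

```csharp
using System;
using System.Text;

// "a" is in the Basic Multilingual Plane: one UTF-16 code unit (one char).
string latin = "a";
// U+1F600 lies outside the BMP, so UTF-16 needs two chars (a surrogate pair).
string emoji = "\U0001F600";

// Length counts UTF-16 code units, not user-perceived characters.
Console.WriteLine(latin.Length); // 1
Console.WriteLine(emoji.Length); // 2
Console.WriteLine(char.IsHighSurrogate(emoji[0])); // True
Console.WriteLine(char.IsLowSurrogate(emoji[1]));  // True

// Encoding.Unicode is UTF-16 (little endian): always 2 bytes per char,
// so nothing is "stripped" for Latin text.
int latinUtf16Bytes = Encoding.Unicode.GetByteCount(latin);
int emojiUtf16Bytes = Encoding.Unicode.GetByteCount(emoji);
Console.WriteLine(latinUtf16Bytes); // 2
Console.WriteLine(emojiUtf16Bytes); // 4

// UTF-8 only appears when the string is explicitly encoded to bytes.
byte[] latinUtf8 = Encoding.UTF8.GetBytes(latin);
byte[] emojiUtf8 = Encoding.UTF8.GetBytes(emoji);
Console.WriteLine(latinUtf8.Length); // 1
Console.WriteLine(emojiUtf8.Length); // 4
```

The takeaway is that the string itself is always UTF-16; the UTF-8 byte counts exist only for the byte arrays produced by Encoding.UTF8.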