I have a string that mixes characters from different code pages:
    string multi = "EnglishРусский日本語";
I need to return the list of code pages it uses:
    int[] GetCodePage(string multi)
    {
        return new int[] {1252, 1251, 932};
    }
From your comments, it seems that your problem is different.
If you only need to check whether a filename (a string) uses only characters from the "default code page", then it is quite simple. The Windows API uses Unicode plus a single non-Unicode code page, the default code page for non-Unicode programs, and on .NET Framework Encoding.Default is that code page. (On .NET Core and .NET 5+, Encoding.Default always returns UTF-8, so there the code page encoding has to be obtained some other way.)
    // Requires: using System; using System.Text;
    public static void Main()
    {
        Console.WriteLine(Encoding.Default.BodyName);
        // I live in Italy, we use Windows-1252 as the default code page
        Console.WriteLine(CanBeEncoded(Encoding.Default, "Hello world àèéìòù"));
        Console.WriteLine(CanBeEncoded(Encoding.Default, "Русский"));
    }
and the interesting code:
    public static bool CanBeEncoded(Encoding enc, string str)
    {
        // We want to modify the Encoding, so we have to clone it
        enc = (Encoding)enc.Clone();
        enc.EncoderFallback = new EncoderExceptionFallback();
        try
        {
            enc.GetByteCount(str);
        }
        catch (EncoderFallbackException)
        {
            return false;
        }
        return true;
    }
Note that this code could be optimized. Using an exception to detect that the string cannot be encoded isn't optimal (but it is easy to write :-) ). A better solution would be to subclass the EncoderFallback class.
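As a rough sketch of that last idea (not tested against every encoding), a custom fallback can simply record that it was invoked instead of throwing. The names FlaggingEncoderFallback, FlaggingBuffer and EncodingChecks are made up for this example; only EncoderFallback, EncoderFallbackBuffer and Encoding come from the framework.

```csharp
using System;
using System.Text;

// Hypothetical helper: remembers whether the encoder ever needed a fallback.
public sealed class FlaggingEncoderFallback : EncoderFallback
{
    // True once the encoder has met a character it cannot encode.
    public bool Triggered { get; private set; }

    // We never supply replacement characters.
    public override int MaxCharCount => 0;

    public override EncoderFallbackBuffer CreateFallbackBuffer()
        => new FlaggingBuffer(this);

    private sealed class FlaggingBuffer : EncoderFallbackBuffer
    {
        private readonly FlaggingEncoderFallback _owner;

        public FlaggingBuffer(FlaggingEncoderFallback owner) { _owner = owner; }

        // No replacement characters are ever pending.
        public override int Remaining => 0;

        public override bool Fallback(char charUnknown, int index)
        {
            _owner.Triggered = true;
            return false; // nothing to substitute, the character is skipped
        }

        public override bool Fallback(char charUnknownHigh, char charUnknownLow, int index)
        {
            _owner.Triggered = true;
            return false;
        }

        public override char GetNextChar() => '\0';

        public override bool MovePrevious() => false;
    }
}

public static class EncodingChecks
{
    // Same contract as CanBeEncoded above, but without exceptions.
    public static bool CanBeEncoded(Encoding enc, string str)
    {
        // One fallback instance per call, so the flag is never shared.
        var fallback = new FlaggingEncoderFallback();
        enc = (Encoding)enc.Clone();
        enc.EncoderFallback = fallback;
        enc.GetByteCount(str);
        return !fallback.Triggered;
    }
}
```

With this version, EncodingChecks.CanBeEncoded(Encoding.ASCII, "Hello") returns true and EncodingChecks.CanBeEncoded(Encoding.ASCII, "Русский") returns false. It still scans the whole string even after the first unencodable character, so it avoids the cost of the exception but not the cost of the full pass.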