How to convert encoding from 870 to 1250 in C#


I am struggling with encoding conversion from code page 870 to 1250. I want to read a file in binary mode in chunks and then convert each chunk to CP1250. So far I've prepared the code below, but it doesn't support the less popular encoding. (I also installed a NuGet package, System.Text.Encoding, but I don't know how to use it.)

Encoding eIBM870 = Encoding.GetEncoding(870);
Encoding eCP1250 = Encoding.GetEncoding(1250);

FileStream fInput = File.OpenRead(fileInput.Text);
BinaryReader binReader = new BinaryReader(fInput);
byte[] bytes = binReader.ReadBytes(50);  //e.g. 50 bytes

byte[] converted = Encoding.Convert(eIBM870, eCP1250, bytes);

String output = eCP1250.GetString(converted);

So, I kindly ask for help.


Solution

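  • If you use .NET Core (not .NET Framework), register the code pages provider before calling Encoding.GetEncoding (CodePagesEncodingProvider lives in System.Text.Encoding.CodePages, not System.Text.Encoding):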

    Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
    

    For instance:

    // Make the additional code pages (such as 870 and 1250) available to Encoding.GetEncoding
    Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
    
    Encoding eIBM870 = Encoding.GetEncoding(870);
    Encoding eCP1250 = Encoding.GetEncoding(1250);
    
    // using: don't forget to Dispose the reader (and the underlying stream)
    using BinaryReader binReader = new BinaryReader(File.OpenRead(fileInput.Text));
    
    byte[] bytes = binReader.ReadBytes(50); 
    
    byte[] converted = Encoding.Convert(eIBM870, eCP1250, bytes);
    
    String output = eCP1250.GetString(converted);
    

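    Please note that working with strings is often the better option. Some encodings are multibyte, so an arbitrary 50-byte chunk can end in the middle of a character and decode incorrectly (870 and 1250 are both single-byte, so this particular pair is not affected):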

    Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
    
    Encoding eIBM870 = Encoding.GetEncoding(870);
    Encoding eCP1250 = Encoding.GetEncoding(1250);
    
    using StreamReader reader = new StreamReader(fileInput.Text, eIBM870);
    
    // ReadLine returns null once the end of the file is reached
    string output = reader.ReadLine();
    
    byte[] converted = eCP1250.GetBytes(output);
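
The two approaches above can be combined to convert a whole file in chunks. The following is only a minimal sketch, not part of the original answer: the class name `CodePageConverter`, the method name `Transcode`, and the default chunk size are hypothetical. It decodes through a `StreamReader` and re-encodes through a `StreamWriter`, so a character is never split at a chunk boundary even if a multibyte code page is used.

```csharp
using System;
using System.IO;
using System.Text;

public static class CodePageConverter
{
    // Hypothetical helper: copies `input` to `output`, re-encoding the text
    // from one code page to another, one chunk of characters at a time.
    public static void Transcode(Stream input, Stream output,
                                 int sourceCodePage, int targetCodePage,
                                 int chunkSize = 4096)
    {
        // Required on .NET Core / .NET 5+ before asking for code pages such as 870 or 1250.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        Encoding source = Encoding.GetEncoding(sourceCodePage);
        Encoding target = Encoding.GetEncoding(targetCodePage);

        // leaveOpen: true, so the caller stays responsible for disposing both streams.
        using var reader = new StreamReader(input, source,
            detectEncodingFromByteOrderMarks: false, bufferSize: chunkSize, leaveOpen: true);
        using var writer = new StreamWriter(output, target, chunkSize, leaveOpen: true);

        char[] buffer = new char[chunkSize];
        int read;

        // The reader decodes bytes to chars; the writer encodes chars to the target code page.
        while ((read = reader.Read(buffer, 0, buffer.Length)) > 0)
        {
            writer.Write(buffer, 0, read);
        }
        // Disposing the writer at the end of the method flushes the remaining bytes to `output`.
    }
}
```

For a file-to-file conversion, the streams could come from `File.OpenRead` and `File.Create`, for example `CodePageConverter.Transcode(inStream, outStream, 870, 1250);`. With this sketch, the EBCDIC bytes 0xC1 0xC2 0xC3 ("ABC" in code page 870) come out as 0x41 0x42 0x43 in code page 1250.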