Decoding GB18030 to UTF-8 in C#

I have a text file. When I open it in Notepad, its contents show as:

Ê¸³ßÓÀ¼ª

If I drag it into the Chrome browser, it is automatically decoded and displayed correctly as

矢尺永吉

After a bit of research, I found that the file is encoded in GB18030. I am attempting to do the conversion in C#. Below is my code:

public static string codeCovert(string s)
{
    Encoding gb18 = Encoding.GetEncoding("gb18030");
    Encoding Utf8 = Encoding.UTF8;

    byte[] gbcode = gb18.GetBytes(s);

    return Utf8.GetString(gbcode);
}

And this still gives a whole bunch of wrong characters. Can anyone help please? Thanks.

Solution

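Your method takes in a string and returns another string, which does not make sense. System.String is always a "vector" of UTF-16 code units, so a string has no GB18030 or UTF-8 form. An encoding only applies when converting between bytes and a string, so the file's bytes must be decoded as GB18030 at the moment they are read.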

Instead, specify the encoding when you read the file:

using System.Text;
using System.IO;

// ...

  var str = File.ReadAllText(@"path\file.txt", Encoding.GetEncoding("GB18030"));

Once str is in memory, it has the value "矢尺永吉". It cannot be "UTF-8" while it is a .NET string in memory; UTF-8 only comes into play when the string is written out as bytes. You can save it to another file as UTF-8, of course:

  File.WriteAllText(@"path\otherfile.txt", str, Encoding.UTF8);
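
To make the bytes-versus-string distinction concrete, here is a minimal sketch that works on raw bytes instead of files. The class and method names are hypothetical, and the byte values assume that Notepad displayed the file using Windows-1252, in which case "Ê¸³ßÓÀ¼ª" corresponds to the bytes CA B8 B3 DF D3 C0 BC AA. It also assumes a target where CodePagesEncodingProvider is available.

```csharp
using System;
using System.Text;

public static class Gb18030Demo
{
    // Hypothetical helper: decodes GB18030 bytes into a .NET (UTF-16) string.
    public static string Decode(byte[] gb18030Bytes)
    {
        // Needed on .NET Core / .NET 5+ so that "GB18030" can be looked up.
        // Normally called once at startup; done here to keep the sketch self-contained.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding("GB18030").GetString(gb18030Bytes);
    }

    public static void Main()
    {
        // Assumption: these are the file's bytes, shown by Notepad as "Ê¸³ßÓÀ¼ª".
        byte[] raw = { 0xCA, 0xB8, 0xB3, 0xDF, 0xD3, 0xC0, 0xBC, 0xAA };

        string text = Decode(raw);                   // decode: bytes -> string
        byte[] utf8 = Encoding.UTF8.GetBytes(text);  // encode: string -> UTF-8 bytes

        Console.WriteLine(text.Length);  // 4 UTF-16 code units
        Console.WriteLine(utf8.Length);  // 12 bytes, 3 per character in UTF-8
    }
}
```

The same 4-character string takes 8 bytes in GB18030 and 12 bytes in UTF-8, which is why "converting" only has meaning at the byte level.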

Edit: In newer versions of .NET (.NET Core and .NET 5 or later), you need to call:

Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

before you can use Encoding.GetEncoding("GB18030").
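
Putting the pieces together, a complete conversion might look like the sketch below. The class name, method name, and command-line handling are hypothetical; only the Encoding and File calls come from the answer above.

```csharp
using System;
using System.IO;
using System.Text;

public static class Gb18030FileConverter
{
    // Hypothetical helper: reads a GB18030 file and rewrites it as UTF-8.
    public static void ConvertToUtf8(string sourcePath, string targetPath)
    {
        // Makes code-page encodings such as GB18030 available on .NET Core / .NET 5+.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        // Decode the file's bytes as GB18030 into an in-memory (UTF-16) string.
        string text = File.ReadAllText(sourcePath, Encoding.GetEncoding("GB18030"));

        // Encode the string as UTF-8 while writing it out.
        File.WriteAllText(targetPath, text, Encoding.UTF8);
    }

    public static void Main(string[] args)
    {
        if (args.Length != 2)
        {
            Console.Error.WriteLine("usage: <source file> <target file>");
            return;
        }
        ConvertToUtf8(args[0], args[1]);
    }
}
```

Note that Encoding.UTF8 writes a byte order mark at the start of the file. If the consumer of the output does not expect one, passing new UTF8Encoding(false) instead should omit it.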