C# encoding issues with CsvHelper


var orders = new List<Order>();
....
orders.Add(...)


string csvstring;
using (var ms = new MemoryStream())
using (var wr = new StreamWriter(ms, Encoding.UTF8))
using (var csvWriter = new CsvWriter(wr, CultureInfo.InvariantCulture, false))
{
    csvWriter.WriteRecords(orders);
    // Flush both writers so the buffered CSV text reaches the MemoryStream before it is read.
    csvWriter.Flush();
    wr.Flush();
    csvstring = Encoding.UTF8.GetString(ms.ToArray());
}

And then

sftp.WriteAllText(fileNameAbsolutePath, csvstring, Encoding.UTF8);

The file created over SFTP has "feff" at the beginning, and it is reported as "orders.csv: text/plain; charset=utf-8".

This is the first part of the problem. What I am looking for is a way to convert this UTF-8 to ISO-8859-1, as the charset expected in the end file is ISO-8859-1.

Maybe I should do something like this?

byte[] bytesSS = Encoding.Convert(Encoding.UTF8, Encoding.GetEncoding("ISO-8859-1"), Encoding.UTF8.GetBytes(csvstring));

string s1 = Encoding.GetEncoding("ISO-8859-1").GetString(bytesSS, 0, bytesSS.Length);

I tried to google "<feff>", but I didn't quite get the concept of a BOM or how to fix this.


Solution

  • I have no idea which SFTP class you use as .NET itself doesn't have an SFTP client. I'll assume you use this one simply because it came first in a Google search for sftp WriteAllText.

    If you want to create a file with a specific encoding, specify it in the StreamWriter constructor instead of UTF8:

    using (var ms = new MemoryStream())
    using (var wr = new StreamWriter(ms, Encoding.GetEncoding("ISO-8859-1")))
    using (var csvWriter = new CsvWriter(wr, CultureInfo.InvariantCulture, false))
    {
        csvWriter.WriteRecords(orders);
    }
    

    On the other hand, UTF8 and Latin1 (like most code pages) use exactly the same byte values for characters in the range 0-127. If you only send plain ASCII text, such as unaccented English, the bytes are identical whichever of these encodings you use. If the actual requirement is to create a UTF8 file without a BOM, you can specify that with the appropriate UTF8Encoding constructor:

    var utf8NoBom = new UTF8Encoding(false);
    using (var ms = new MemoryStream())
    using (var wr = new StreamWriter(ms, utf8NoBom))
    using (var csvWriter = new CsvWriter(wr, CultureInfo.InvariantCulture, false))
    {
        csvWriter.WriteRecords(orders);
    }
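
    The "feff" in the question is the byte order mark U+FEFF, which UTF-8 encodes as the bytes EF BB BF. As a minimal sketch of the difference between the encodings discussed above, the following writes the single character "é" through a StreamWriter three times and prints the resulting bytes. The BomDemo class and its Encode helper are hypothetical names used only for this illustration; CsvHelper is left out so that only the encoding behaviour is visible.

    ```csharp
    using System;
    using System.IO;
    using System.Text;

    static class BomDemo
    {
        // Writes `text` through a StreamWriter using `encoding` and returns the raw bytes.
        public static byte[] Encode(string text, Encoding encoding)
        {
            using (var ms = new MemoryStream())
            {
                using (var wr = new StreamWriter(ms, encoding))
                {
                    wr.Write(text);
                } // Disposing the writer flushes it; ToArray still works on the closed stream.
                return ms.ToArray();
            }
        }

        static void Main()
        {
            // Encoding.UTF8 emits the BOM (EF BB BF) before the two bytes of "é".
            Console.WriteLine(BitConverter.ToString(Encode("\u00E9", Encoding.UTF8)));                       // EF-BB-BF-C3-A9
            // UTF8Encoding(false) produces the same text bytes without the BOM.
            Console.WriteLine(BitConverter.ToString(Encode("\u00E9", new UTF8Encoding(false))));             // C3-A9
            // ISO-8859-1 has no BOM and stores "é" as a single byte.
            Console.WriteLine(BitConverter.ToString(Encode("\u00E9", Encoding.GetEncoding("ISO-8859-1")))); // E9
        }
    }
    ```

    For characters in the range 0-127 the last two calls would return identical bytes, which is why the choice only matters once the data contains accented or other non-ASCII characters.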
    

    All SFTP clients have (or should have) a way to upload data using a stream. This means you can use Stream.CopyTo to copy data from the memory stream to the upload stream. Assuming OpenWrite is available, you can modify the code to:

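    using (var ms = new MemoryStream())
    {
        // leaveOpen: true keeps ms usable after the writers are disposed;
        // by default StreamWriter closes the underlying stream.
        using (var wr = new StreamWriter(ms, Encoding.GetEncoding("ISO-8859-1"), 1024, true))
        using (var csvWriter = new CsvWriter(wr, CultureInfo.InvariantCulture, false))
        {
            csvWriter.WriteRecords(orders);
        }

        // Rewind so CopyTo starts at the beginning of the data.
        ms.Position = 0;

        using (var stream = sftp.OpenWrite(somePath))
        {
            ms.CopyTo(stream);
        }
    }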
    

    When CsvHelper finishes writing, the MemoryStream's position is at the end of the stream, so CopyTo wouldn't copy anything. Setting ms.Position = 0 moves the position back to the start of the stream.
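
    The effect of the rewind can be checked without an SFTP server. In the sketch below a second MemoryStream stands in for the upload stream, and the RewindDemo class and its EncodeAndCopy helper are hypothetical names introduced only for this example. It assumes the StreamWriter constructor overload that takes a leaveOpen flag, so that the buffer is still open when its position is reset.

    ```csharp
    using System;
    using System.IO;
    using System.Text;

    static class RewindDemo
    {
        // Encodes `text` as ISO-8859-1 into a buffer, copies the buffer to `destination`,
        // and returns the number of bytes that CopyTo had available to copy.
        public static long EncodeAndCopy(string text, Stream destination, bool rewind)
        {
            using (var ms = new MemoryStream())
            {
                // leaveOpen: true, otherwise disposing the writer would close ms.
                using (var wr = new StreamWriter(ms, Encoding.GetEncoding("ISO-8859-1"), 1024, true))
                {
                    wr.Write(text);
                }

                if (rewind)
                {
                    ms.Position = 0; // move back to the start of the written data
                }

                long remaining = ms.Length - ms.Position; // what CopyTo will transfer
                ms.CopyTo(destination);
                return remaining;
            }
        }

        static void Main()
        {
            const string csv = "id;name\r\n1;Zo\u00EB\r\n";
            using (var upload = new MemoryStream()) // stand-in for the SFTP upload stream
            {
                Console.WriteLine(EncodeAndCopy(csv, upload, rewind: false)); // 0
                Console.WriteLine(EncodeAndCopy(csv, upload, rewind: true));  // 16
                Console.WriteLine(upload.Length);                             // 16
            }
        }
    }
    ```

    Without the rewind nothing reaches the destination; with it, all 16 bytes are copied, one byte per character because the text is encoded as ISO-8859-1.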