For our current project I am using the CsvHelper NuGet package, and everything works perfectly except when a field contains special characters (ä, ü, ...). How can I make it read those characters correctly instead of showing ? in their place?
I tried changing the culture when reading the byte stream from the file, and I tried different cultures (CurrentCulture and InvariantCulture) when parsing the CSV, but neither made a difference.
I often have this issue when someone saves an Excel file as CSV (Comma delimited) (*.csv) rather than as CSV UTF-8 (Comma delimited) (*.csv). Depending on the country it is saved in, this often means it was saved with Windows-1252 encoding. In most cases, you can get away with using ISO-8859-1 encoding, also known as Latin-1, when reading the file with StreamReader. If you still have some characters that are not getting read correctly, you may have to use the exact encoding that was used to save the file.
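To see why the culture setting makes no difference, it can help to look at the bytes directly. The sketch below is only an illustration: the class name DecodingDemo and the sample name "Müller" are made up for this example, and it assumes .NET 5 or later, where Encoding.Latin1 is available. StreamReader decodes as UTF-8 unless told otherwise, and a lone 0xFC byte is not valid UTF-8, so it becomes the replacement character U+FFFD, which many consoles display as ?.

```csharp
using System;
using System.Text;

// Hypothetical demo class, not part of CsvHelper.
public static class DecodingDemo
{
    // "Müller" as stored by a single-byte Western European encoding:
    // the letter 'ü' is the single byte 0xFC.
    public static readonly byte[] SampleBytes = { 0x4D, 0xFC, 0x6C, 0x6C, 0x65, 0x72 };

    // Decoding with the wrong encoding: 0xFC is invalid in UTF-8,
    // so the decoder substitutes U+FFFD for it.
    public static string DecodeAsUtf8() => Encoding.UTF8.GetString(SampleBytes);

    // Decoding with ISO-8859-1 maps 0xFC back to 'ü'.
    public static string DecodeAsLatin1() => Encoding.Latin1.GetString(SampleBytes);

    public static void Main()
    {
        Console.WriteLine(DecodeAsUtf8());   // M, then U+FFFD, then "ller"
        Console.WriteLine(DecodeAsLatin1()); // Müller
    }
}
```

The culture only affects how already-decoded text (numbers, dates) is parsed, so the fix has to happen where the bytes are turned into characters.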
ISO-8859-1 (also called Latin-1) is identical to Windows-1252 (also called CP1252) except for the code points 128-159 (0x80-0x9F). ISO-8859-1 assigns several control codes in this range. Windows-1252 has several characters, punctuation, arithmetic and business symbols assigned to these code points. https://www.i18nqa.com/debug/table-iso8859-1-vs-windows-1252.html
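If the file does use characters from the 0x80-0x9F range (the euro sign is a common one), Latin-1 will not be enough. A minimal sketch of getting Windows-1252 on .NET Core follows. It assumes CodePagesEncodingProvider is available, which on older .NET Core versions may require the System.Text.Encoding.CodePages NuGet package. The class name CodePageDemo and the helper GetWindows1252 are hypothetical names for this example.

```csharp
using System;
using System.Text;

// Hypothetical demo class showing where the two encodings differ.
public static class CodePageDemo
{
    // Registers the code page provider so that code page 1252
    // can be resolved on .NET Core / .NET 5+.
    public static Encoding GetWindows1252()
    {
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding(1252);
    }

    public static void Main()
    {
        // 0x80 lies in the range where the encodings differ; 0xE4 does not.
        byte[] bytes = { 0x80, 0xE4 };

        string latin1 = Encoding.Latin1.GetString(bytes);
        string cp1252 = GetWindows1252().GetString(bytes);

        Console.WriteLine(((int)latin1[0]).ToString("X4")); // 0080 (control code)
        Console.WriteLine(((int)cp1252[0]).ToString("X4")); // 20AC (euro sign)
        Console.WriteLine(latin1[1] == cp1252[1]);          // True, both are 'ä'
    }
}
```

The resulting Encoding can be passed to StreamReader in the same way as Encoding.Latin1.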
In .NET Core, only a limited number of encodings are available by default. Listing them with Encoding.GetEncodings() produces the following output when run on .NET Core:
Info.CodePage | Info.Name | Info.DisplayName |
---|---|---|
1200 | utf-16 | Unicode |
1201 | utf-16BE | Unicode (Big-Endian) |
12000 | utf-32 | Unicode (UTF-32) |
12001 | utf-32BE | Unicode (UTF-32 Big-Endian) |
20127 | us-ascii | US-ASCII |
28591 | iso-8859-1 | Western European (ISO) |
65000 | utf-7 | Unicode (UTF-7) |
65001 | utf-8 | Unicode (UTF-8) |
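A listing like the table above can be produced with something along these lines. This is a sketch rather than the exact code that generated the table: the class name EncodingList and the method BuildRows are made up for this example, and the rows you get depend on the runtime version and on whether any extra encoding providers have been registered.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper that lists the encodings known to the runtime.
public static class EncodingList
{
    // Builds one pipe-separated row per encoding reported by the runtime.
    public static List<string> BuildRows()
    {
        var rows = new List<string>();
        foreach (EncodingInfo info in Encoding.GetEncodings())
        {
            rows.Add($"{info.CodePage} | {info.Name} | {info.DisplayName} |");
        }
        return rows;
    }

    public static void Main()
    {
        Console.WriteLine("Info.CodePage | Info.Name | Info.DisplayName |");
        foreach (string row in BuildRows())
        {
            Console.WriteLine(row);
        }
    }
}
```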
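using System.Globalization;
using System.IO;
using System.Linq;
using System.Text;
using CsvHelper;

void Main()
{
    // Read the file as ISO-8859-1 (Latin-1) instead of the default UTF-8.
    using var reader = new StreamReader(@"C:\Users\myName\Documents\TestUmlauts.csv",
        Encoding.Latin1);
    using var csv = new CsvReader(reader, CultureInfo.InvariantCulture);
    // GetRecords is lazy; ToList() reads all records while the reader is still open.
    var records = csv.GetRecords<Foo>().ToList();
}

public class Foo
{
    public int Id { get; set; }
    public string Name { get; set; }
}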