Is there a way to remove non-ASCII characters through CsvHelper configuration
instead of writing the conversion in application code?
I saved an Excel workbook as CSV and found some values like AbsMarketValue�������������
and I would like to get rid of the non-ASCII characters.
I tried setting
csv.Configuration.Encoding = Encoding.ASCII;
but that did not work.
With reference to "How can you strip non-ASCII characters from a string? (in C#)":
string s = "søme string";
s = Regex.Replace(s, @"[^\u0000-\u007F]+", string.Empty);
The above approach works for me, but I want to avoid it because it requires adding this kind of code to the application for every text field.
I tried to do this in the conversion map, but that did not work.
By registering a custom type converter for string, you can make every string property contain only ASCII characters when the file is read.
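using System.Globalization;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;
using CsvHelper;
using CsvHelper.Configuration;
using CsvHelper.TypeConversion;

void Main()
{
    using (var reader = new StringReader("Id,Name\n1,AbsMarketValue�������������"))
    using (var csv = new CsvReader(reader, CultureInfo.InvariantCulture))
    {
        // Register the converter for every string member before reading.
        csv.Context.TypeConverterCache.AddConverter<string>(new AsciiOnlyConverter());
        // GetRecords is lazy, so materialize the records while the reader is still open.
        var records = csv.GetRecords<Foo>().ToList();
    }
}

public class Foo
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class AsciiOnlyConverter : StringConverter
{
    public override object ConvertFromString(string text, IReaderRow row, MemberMapData memberMapData)
    {
        // Strip every character outside the ASCII range (U+0000 to U+007F).
        var ascii = text == null ? null : Regex.Replace(text, @"[^\u0000-\u007F]+", string.Empty);
        return base.ConvertFromString(ascii, row, memberMapData);
    }
}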
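
If only some columns need cleaning, the same converter can be attached to individual members through a class map instead of being registered for every string. The following is a minimal sketch, assuming a recent CsvHelper version in which class maps are registered through csv.Context; FooMap is a hypothetical name introduced here for illustration, and the sample input is made up.

```csharp
using System;
using System.Globalization;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;
using CsvHelper;
using CsvHelper.Configuration;
using CsvHelper.TypeConversion;

public class Foo
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class AsciiOnlyConverter : StringConverter
{
    public override object ConvertFromString(string text, IReaderRow row, MemberMapData memberMapData)
    {
        // Strip every character outside the ASCII range (U+0000 to U+007F).
        var ascii = text == null ? null : Regex.Replace(text, @"[^\u0000-\u007F]+", string.Empty);
        return base.ConvertFromString(ascii, row, memberMapData);
    }
}

// Hypothetical map (not part of the original answer): only Name is cleaned.
public sealed class FooMap : ClassMap<Foo>
{
    public FooMap()
    {
        Map(m => m.Id);
        Map(m => m.Name).TypeConverter<AsciiOnlyConverter>();
    }
}

public static class Program
{
    public static void Main()
    {
        // Made-up input: U+00F8 and U+FFFD are both outside the ASCII range.
        using (var reader = new StringReader("Id,Name\n1,Abs\u00F8MarketValue\uFFFD\uFFFD"))
        using (var csv = new CsvReader(reader, CultureInfo.InvariantCulture))
        {
            csv.Context.RegisterClassMap<FooMap>();
            var records = csv.GetRecords<Foo>().ToList();
            Console.WriteLine(records[0].Name); // prints "AbsMarketValue"
        }
    }
}
```

Registering the converter globally, as in the answer above, is less code when every text field should be cleaned; the per-member map is the narrower option when some columns legitimately contain non-ASCII text.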