Search code examples
c#.netparsingcsvcsvhelper

Using CsvHelper can I translate white space to a nullable?


I have some really lame Csv files I need to parse. I am using CsvHelper and it is working awesome. Except I have some lines that have whitespace where normaly I have a double.

File:

Text,SomeDouble,MoreText

"Good",1.23,"Good"

"Bad", ,"Bad"

if I try and map this into

public class Test
{
  [CsvField(Name = "Text")]
  public string Text { get; set; }

  [CsvField(Name = "SomeDouble")]
  public double? SomeDouble{ get; set; }

  [CsvField(Name = "MoreText")]
  public string MoreText{ get; set; }
}

then I get an error like this:

CsvHelper.CsvReaderException: An error occurred trying to read a record of type

Row: '2' (1-based)

Field Index: '1' (0-based)

Field Name: 'SomeDouble'

Field Value: ' '

System.Exception: is not a valid value for Double. ---> System.FormatException: Input string was not in a correct format.
at System.Number.ParseDouble(String value, NumberStyles options, NumberFormatInfo numfmt) at System.ComponentModel.DoubleConverter.FromString(String value, NumberFormatInfo formatInfo) at System.ComponentModel.BaseNumberConverter.ConvertFrom(ITypeDescriptorContext context, CultureInfo culture, Object value) --- End of inner exception stack trace --- at System.ComponentModel.BaseNumberConverter.ConvertFrom(ITypeDescriptorContext context, CultureInfo culture, Object value) at System.ComponentModel.NullableConverter.ConvertFrom(ITypeDescriptorContext context, CultureInfo culture, Object value) at lambda_method(Closure , ICsvReader ) at CsvHelper.CsvReader.d__0`1.MoveNext()

As I see it, my options are to create a custom parser, or map my value into a string property and do the parsing there.

Are there any other options?

It would be nice if I could configure that I want to treat white space as null.

As requested, here is a code sample that reproduces the problem

 static class Program
    {
        public class Test
        {
            [CsvField(Name = "Text")]
            public string Text { get; set; }

            [CsvField(Name = "SomeDouble")]
            public double? SomeDouble { get; set; }

            [CsvField(Name = "MoreText")]
            public string MoreText { get; set; }
        }

        static void Main(string[] args)
        {
            // create fake in memory file
            var memoryStream = new MemoryStream();
            var streamWriter = new StreamWriter(memoryStream);
            streamWriter.WriteLine("Text,SomeDouble,MoreText");
            streamWriter.WriteLine("Good, 1.23, Good");
            streamWriter.WriteLine("Bad, ,Bad");

            streamWriter.Flush();

            //reset the file to the begining
            memoryStream.Position = 0;

            using (
                var csv =
                    new CsvReader(
                        new StreamReader(memoryStream)))
            {
                // this call will blow up with the exception.
                var records = csv.GetRecords<Test>().ToList();

                //carry on and do stuff with 'records'...
            }
    }

Thanks.


Solution

  • In the end I went with creating my own type converter that will treat whitespace the same as a null.

    public class WhiteSpaceToNullableTypeConverter<T> : TypeConverter where T : struct
    {
        public override bool CanConvertFrom(ITypeDescriptorContext context, Type sourceType)
        {
            return sourceType == typeof (string);
        }
    
        public override bool CanConvertTo(ITypeDescriptorContext context, Type destinationType)
        {
            return destinationType == typeof (T?);
        }
    
        public override object ConvertFrom(ITypeDescriptorContext context, System.Globalization.CultureInfo culture,
                                           object value)
        {
            T? result = null;
    
            var stringValue = (string) value;
            if (!string.IsNullOrWhiteSpace(stringValue))
            {
                var converter = TypeDescriptor.GetConverter(typeof(T));
                result = (T)converter.ConvertFrom(stringValue.Trim());
            }
    
            return result;
        }
    
        public override object ConvertTo(ITypeDescriptorContext context, System.Globalization.CultureInfo culture,
                                         object value, Type destinationType)
        {
            var result = (T?) value;
            return result.ToString();
        }
    }
    

    Apply it to your model like this

    public class Test
    {
        [CsvField(Name = "Text")]
        public string Text { get; set; }
    
        [CsvField(Name = "SomeDouble")]
        [TypeConverter( typeof( WhiteSpaceToNullableTypeConverter<Double> ) )]
        public double? SomeDouble{ get; set; }
    
        [CsvField(Name = "MoreText")]
        public string MoreText{ get; set; }
    }