Search code examples
csvhelpercsvreader

How to specify default int value for empty strings when using CsvDataReader in CsvHelper


I want to use CsvReader and CsvDataReader to load the input CSV into a DataTable, where the colums in the destination data table defines the data types. But for columns e.g. of int type, the CSV parsing fails when an empty string (;;) is found in a CSV field.

How can i specify default int value (0) for the Credits column when empty string is found in the CSV input?

      // Construct the datatable
      DataTable dataTable = new DataTable();
      dataTable.Columns.Add(new DataColumn("Name", typeof(string)));
      dataTable.Columns.Add(new DataColumn("Credits", typeof(int)));
      dataTable.Columns.Add(new DataColumn("Notes", typeof(string)));
      dataTable.Columns.Add(new DataColumn("LastLogin", typeof(DateTime)));
      dataTable.Columns.Add(new DataColumn("Balance", typeof(decimal)));

      // Prepare CSV content
      StringBuilder sbInput = new StringBuilder();
      sbInput.AppendLine("Name;Credits;Notes;LastLogin;Balance");
      sbInput.AppendLine("Michael;2433;Good guy;2023-03-28;736,26");
      sbInput.AppendLine("John;;Bad guy;2023-04-01;-49,25");

      // Prepare the configuration
      CsvConfiguration csvConfiguration = new CsvConfiguration(CultureInfo.InvariantCulture)
      {
        NewLine = Environment.NewLine,
        Delimiter = ";",
        Mode = CsvMode.NoEscape,
        TrimOptions = TrimOptions.Trim,
        UseNewObjectForNullReferenceMembers = true
      };

      // Read the CSV stream
      using (var csvReader = new CsvReader(new StringReader(sbInput.ToString()), csvConfiguration))
      {
        using (var csvDataReader = new CsvDataReader(csvReader))
        {
            // Following exception thrown if John;;Bad guy 
            //   Input string was not in a correct format.Couldn't store <> in Credits Column.  Expected type is Int32.
            dataTable.Load(csvDataReader);
        }
      }

Solution

  • There are issues with CsvDataReader where it does not obey MissingFieldFound options or TypeConverters. This is briefly explained here: Handling DateTime fields from CsvDataReader that contains an empty string

    A better solution in this modern era would be to abandon DataTable altogether an switch over to using a fully typed collection that you can bind to your data table or directly to the control as a datasource. - that discussion is out of scope...

    Instead, we can bypass the CsvDataReader and use a TypeConverter to return the default value for the type if the string is empty:

    https://dotnetfiddle.net/maLOt0

    /// <summary>
    /// Converts an <see cref="int"/> or <see cref="decimal"/> to and from a <see cref="string"/>. and treats empty strings as default values
    /// </summary>
    public class DefaultNumericConverter<T> : DefaultTypeConverter
    {
        /// <summary>
        /// Converts the string to an object. int or decimal defaulting to zero.
        /// </summary>
        /// <remarks>Will revert to default implementation for non int or decimal expected types</remarks>
        /// <param name="text">The string to convert to an object.</param>
        /// <param name="row">The <see cref="IReaderRow"/> for the current record.</param>
        /// <param name="memberMapData">The <see cref="MemberMapData"/> for the member being created.</param>
        /// <returns>The object created from the string.</returns>
        public override object ConvertFromString(string text, IReaderRow row, MemberMapData memberMapData)
        {
            if (String.IsNullOrWhiteSpace(text))
                return default(T);
            
            var type = typeof(T);
            if (type == typeof(int))
            {
                if (int.TryParse(text, memberMapData.TypeConverterOptions.NumberStyles ?? NumberStyles.Integer, memberMapData.TypeConverterOptions.CultureInfo, out var i))
                {
                    return i;
                }
            }
            else if (type == typeof(decimal))
            {
                if (decimal.TryParse(text, memberMapData.TypeConverterOptions.NumberStyles ?? NumberStyles.Currency, memberMapData.TypeConverterOptions.CultureInfo, out var m))
                {
                    return m;
                }
            }
            return base.ConvertFromString(text, row, memberMapData);
        }
    }
    

    Then we can replace the default type converters for the types we want to default. You could do this in a more generic sense for multiple or all types, but that would be overkill if you didn't specifically need it.

        // Read the CSV stream
        using (var csvReader = new CsvReader(new StringReader(sbInput.ToString()), csvConfiguration))
        {
            csvReader.Context.TypeConverterCache.RemoveConverter<int>();
            csvReader.Context.TypeConverterCache.AddConverter<int>(new DefaultNumericConverter<int>());
            csvReader.Context.TypeConverterCache.RemoveConverter<decimal>();
            csvReader.Context.TypeConverterCache.AddConverter<decimal>(new DefaultNumericConverter<decimal>());
    
            // Unfortunately, it looks like CsvDataReader treats all values as strings and ignores the TypeConverters for other types. 
            // So... don't use it. https://stackoverflow.com/a/56064792/1690217
            csvReader.Read();
            csvReader.ReadHeader();
            while(csvReader.Read())
            {
                var row = dataTable.NewRow();
                foreach (DataColumn column in dataTable.Columns)
                {
                    row[column.ColumnName] = csvReader.GetField(column.DataType, column.ColumnName);
                }
                dataTable.Rows.Add(row);
            }
        }