Search code examples
c#csvcsvhelpercsvreader

How to detect if a row has extra columns (more than the header)


While reading a CSV file, how can I configure CsvHelper to enforce that each row has no extra columns that are not found in the header? I cannot find any obvious property under CsvConfiguration nor under CsvHelper.Configuration.Attributes.

Context: In our CSV file format, the last column is a string description, which our users (using plain-text editors) sometimes forget to quote when the description contains commas. Such "raw" commas cause that row to have extra columns, and the intended description read into the software omits the description after the first raw comma. I want to detect this and throw an exception that suggests to the user they may have forgotten to quote the description cell.

It looks like CsvConfiguration.DetectColumnCountChanges might be related, but presently the 29.0.0 library lacks any Intellisense description of CsvConfiguration properties, so I have no idea how to use this.

Similar information for other CSV libraries:


Solution

  • You were on the right track with CsvConfiguration.DetectColumnCountChanges.

    void Main()
    {
        var config = new CsvConfiguration(CultureInfo.InvariantCulture)
        {
            DetectColumnCountChanges = true
        };
        
        using (var reader = new StringReader("Id,Name\n1,MyName\n2,YourName,ExtraColumn"))
        using (var csv = new CsvReader(reader, config))
        {
            try
            {           
                var records = csv.GetRecords<Foo>().ToList();
            }
            catch (BadDataException ex)
            {
                if (ex.Message.StartsWith("An inconsistent number of columns has been detected."))
                {
                    Console.WriteLine("There is an issue with an inconsistent number of columns on row {0}", ex.Context.Parser.RawRow);
                    Console.WriteLine("Row data: \"{0}\"", ex.Context.Parser.RawRecord);
                    Console.WriteLine("Please check for commas in a field that were not properly quoted.");
                } 
            }
            
        }
    }
    
    public class Foo
    {
        public int Id { get; set; }
        public string Name { get; set; }
    }