c# csv csvhelper

Handling bad CSV records in CsvHelper


I would like to iterate through all records in a CSV file, adding the good records to one collection and handling the "bad" ones separately. I can't seem to do this, so I think I must be missing something.

If I catch the BadDataException, subsequent reads fail, so I cannot carry on and read the rest of the file:

while (true)
{
    try
    {
        if (!reader.Read())
            break;

        var record = reader.GetRecord<Record>();
        goodList.Add(record);
    }
    catch (BadDataException ex)
    {
        // Exception is caught but I won't be able to read further rows in file
        // (all further reader.Read() result in same exception thrown)
        Console.WriteLine(ex.Message);
    }
}

The other option discussed is setting the BadDataFound callback to handle it:

reader.Configuration.BadDataFound = x =>
{
    Console.WriteLine($"Bad data: <{x.RawRecord}>");
};

However, although the callback is called, the bad record still ends up in my "good list".

Is there some way I can query the reader to see if the record is good before adding it to my list?

For this example, my Record definition is:

class Record
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public int Age { get; set; }
}

And the data (first row bad, second row good):

"Jo"hn","Doe",43
"Jane","Doe",21

Interestingly, handling a missing field with MissingFieldException works exactly as I would like: the exception is thrown, but subsequent rows are still read OK.


Solution

  • Here is the example I supplied. It sets a flag in the BadDataFound callback and checks that flag after reading each record.

    void Main()
    {
        using (var stream = new MemoryStream())
        using (var writer = new StreamWriter(stream))
        using (var reader = new StreamReader(stream))
        using (var csv = new CsvReader(reader))
        {
            writer.WriteLine("FirstName,LastName");
            writer.WriteLine("\"Jon\"hn\"\",\"Doe\"");
            writer.WriteLine("\"Jane\",\"Doe\"");
            writer.Flush();
            stream.Position = 0;
    
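            var good = new List<Test>();
            var bad = new List<string>();

            // Set by the callback whenever the current row contains bad data.
            var isRecordBad = false;
            csv.Configuration.BadDataFound = context =>
            {
                isRecordBad = true;
                bad.Add(context.RawRecord);
            };
            while (csv.Read())
            {
                var record = csv.GetRecord<Test>();

                // Keep the record only if the callback did not fire for this row.
                if (!isRecordBad)
                {
                    good.Add(record);
                }
    
                // Reset the flag before moving on to the next row.
                isRecordBad = false;
            }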
    
            // Dump() is LINQPad's output helper; outside LINQPad, print the lists another way.
            good.Dump();
            bad.Dump();
        }
    }
    
    public class Test
    {
        public string FirstName { get; set; }
        public string LastName { get; set; }
    }
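
The example above assigns the callback through csv.Configuration. As far as I know, newer CsvHelper releases no longer allow that after the reader is constructed, so the same flag idea would have to be set up front. The sketch below is an adaptation under that assumption, not something from the original answer: it assumes a release where CsvConfiguration takes a CultureInfo and the BadDataFound callback receives a single args object with a RawRecord property. The names Person and CsvSplitter are made up for illustration.

```csharp
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using CsvHelper;
using CsvHelper.Configuration;

// Hypothetical record type, mirroring the Test class above.
public class Person
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

// Hypothetical helper; not part of CsvHelper.
public static class CsvSplitter
{
    public static (List<Person> Good, List<string> Bad) Split(TextReader input)
    {
        var good = new List<Person>();
        var bad = new List<string>();

        // Raw text of the current row if the callback fired for it, otherwise null.
        string badRawRecord = null;

        // Assumption: this release accepts the callback via the configuration object.
        var config = new CsvConfiguration(CultureInfo.InvariantCulture)
        {
            // Only remember the row here; the callback may fire more than once per row.
            BadDataFound = args => badRawRecord = args.RawRecord,
        };

        using (var csv = new CsvReader(input, config))
        {
            // Read the header row explicitly; stop early if the input is empty.
            if (!csv.Read())
            {
                return (good, bad);
            }
            csv.ReadHeader();

            while (csv.Read())
            {
                var record = csv.GetRecord<Person>();

                // Route the row based on whether the callback fired for it.
                if (badRawRecord == null)
                {
                    good.Add(record);
                }
                else
                {
                    bad.Add(badRawRecord);
                }

                // Reset before the next row.
                badRawRecord = null;
            }
        }

        return (good, bad);
    }
}
```

Calling CsvSplitter.Split(new StreamReader(stream)) on the same three lines written in the example above should give the two lists that the original code dumps. Storing the raw row in a variable and adding it to the bad list after GetRecord, rather than inside the callback, is a deliberate choice so that a row is listed once even if the callback runs several times for it.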