CsvHelper - Perform logic while reading records


I'd like to apply logic while reading a file, such as cleaning a string of special characters, or determining the difference between two dates if a field is blank.

I've tried this in my map:

public InboundPlacementMap()
    {
        AutoMap();
        Map(m => m.HomePhone).ConvertUsing(
            rec =>
            {
                return CleanPhoneNumber(rec.HomePhone);
            });

        Map(m => m.LengthOfStay).Name("Length of Stay").ConvertUsing(
            row =>
            {
                if (row.LengthOfStay > 0)
                    return (int)row.LengthOfStay;
                return (row.DischargeDate - row.AdmitDate).Days;
            });
    }

    private string CleanPhoneNumber(string phone)
    {
        //Do some logic to remove characters, etc.
        return phone;
    }

(In reality, CleanPhoneNumber lives in a different library that is shared across projects.) Calling it from the map felt like a code smell, and it also didn't seem to work:

 Map(m => m.PatHomePhone).ConvertUsing(rec =>
    { return PamUtility.Utilities.CleanPhoneNumber(rec.PatHomePhone); });

In the method that does the reading, I'm using GetRecords<>() to read everything at once. Am I better off reading the records one by one and performing my logic after I read each one? (That seems messy to me.)

        List<InboundPlacementFileRecord> allRecords = new List<InboundPlacementFileRecord>();
        using (TextReader textReader = File.OpenText(fileToRead))
        {
            var csv = new CsvReader(textReader);
            csv.Configuration.Delimiter = "|";
            csv.Configuration.IgnoreBlankLines = true;
            csv.Configuration.PrepareHeaderForMatch = 
                        header => header.Replace(" ", string.Empty);
            csv.Configuration.HeaderValidated= null;
            csv.Configuration.MissingFieldFound = null;
            csv.Configuration.RegisterClassMap<InboundPlacementMap>();
            allRecords = csv.GetRecords<InboundPlacementFileRecord>().ToList();
        }

EDIT: For reference, here is what the record-by-record approach would look like. With more logic to perform it would rapidly get ugly, hence my desire to put it in the mapping:

            while (csv.Read())
            {
                var record = csv.GetRecord<InboundPlacementFileRecord>();

                record.LengthOfStay=(record.LengthOfStay>0)? record.LengthOfStay : 
                    (int)(record.DischargeDate-record.AdmitDate).TotalDays;

                // ... other logic here ...

                allRecords.Add(record);
            }

(Using latest CsvHelper, 7.1.0 at the time of this question.)


Solution

  • It's really a matter of preference. Doing everything in a mapping is fine, and doing it inline is fine too. Another option, instead of ConvertUsing, is to create a custom type converter. You can use one of the built-in ones as an example: https://github.com/JoshClose/CsvHelper/blob/master/src/CsvHelper/TypeConversion/BooleanConverter.cs

    GetRecords<T>() returns an IEnumerable<T> that yields records lazily. It pulls only one record on each iteration, so you don't have to worry about all the data being in memory. Calling something like ToList() pulls every record into memory, and Count() enumerates the entire file, so be careful.
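
As a minimal sketch of the mapping-based option, the example below pairs a custom type converter for the phone number with a ConvertUsing delegate for the length of stay, then iterates GetRecords<T>() one record at a time. It assumes the CsvHelper 7.x API used in the question (newer releases changed some signatures, such as the CsvReader constructor). PlacementRecord, PhoneConverter, PlacementMap, and PlacementDemo are hypothetical names, and the digit-stripping rule is only a stand-in for the real CleanPhoneNumber logic. Note that, as far as I can tell, the read-side ConvertUsing delegate is handed the raw row (IReaderRow) rather than a populated record, so other columns are read with GetField.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using CsvHelper;
using CsvHelper.Configuration;
using CsvHelper.TypeConversion;

// Hypothetical record type, trimmed to the fields used in this sketch.
public class PlacementRecord
{
    public string HomePhone { get; set; }
    public DateTime AdmitDate { get; set; }
    public DateTime DischargeDate { get; set; }
    public int LengthOfStay { get; set; }
}

// Custom converter: keeps only the digits of the raw field text.
// (Stand-in for the shared CleanPhoneNumber helper; an assumption.)
public class PhoneConverter : DefaultTypeConverter
{
    public override object ConvertFromString(string text, IReaderRow row, MemberMapData memberMapData)
    {
        return new string((text ?? string.Empty).Where(char.IsDigit).ToArray());
    }
}

public sealed class PlacementMap : ClassMap<PlacementRecord>
{
    public PlacementMap()
    {
        Map(m => m.HomePhone).TypeConverter<PhoneConverter>();
        Map(m => m.AdmitDate);
        Map(m => m.DischargeDate);

        // The delegate receives the row being read, so sibling columns
        // are fetched by header name instead of via record properties.
        Map(m => m.LengthOfStay).ConvertUsing(row =>
        {
            int stay;
            if (int.TryParse(row.GetField("LengthOfStay"), out stay) && stay > 0)
                return stay;
            var admit = row.GetField<DateTime>("AdmitDate");
            var discharge = row.GetField<DateTime>("DischargeDate");
            return (discharge - admit).Days;
        });
    }
}

public static class PlacementDemo
{
    // Streams records: each one is built only when the caller asks for it.
    public static IEnumerable<PlacementRecord> Read(TextReader input)
    {
        using (var csv = new CsvReader(input))
        {
            csv.Configuration.Delimiter = "|";
            csv.Configuration.RegisterClassMap<PlacementMap>();
            foreach (var record in csv.GetRecords<PlacementRecord>())
                yield return record;
        }
    }
}
```

A caller can then write foreach (var r in PlacementDemo.Read(File.OpenText(fileToRead))) and handle each record as it arrives, or append .ToList() when holding the whole file in memory is acceptable. Keeping the conversion logic in the converter and the map leaves the reading loop free of per-field clean-up code.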