c# .net filehelpers

Change FileHelpers EOL Character


I'm trying to parse 10GB of .dat files into something recognizable in .NET. The column delimiter is a '~' and the EOL is a '++EOL++'. I know how to handle the delimiter but I can't find an easy way to handle the '++EOL++' when there are no actual line breaks in the file. Can this be handled with an option in FileHelpers or would I have to write something custom?


Solution

  • No, FileHelpers does not support files with unusual end-of-line character sequences by default.

    It would probably be easiest to pre-parse the file and replace the EOL sequences with ordinary line breaks. However, FileHelpers is an extensible library, so you could create your own DataStorage subclass. You would essentially have to override ExtractRecords along these lines:

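    public override object[] ExtractRecords()
    {
        // MyStreamReader is your own reader that understands the custom EOL sequence.
        using (MyStreamReader reader = new MyStreamReader(fileName, base.mEncoding, true, 102400))
        {
            // The using block disposes (and therefore closes) the reader,
            // so no explicit reader.Close() call is needed.
            T[] localArray = this.ReadStream(reader, maxRecords);
            return localArray;
        }
    }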
    

    and then create a new class, MyStreamReader, which would be identical to the (unfortunately sealed) InternalStreamReader except for ReadLine, which contains the EOL-detection code:

    switch (ch)
    {
        case '\n':
        case '\r':
    
        // etc...
    }
    

    (By the way, I'm referring to the source code for FileHelpers 2.9.9. Version 2.0.0 seems to use a System.IO.StreamReader, so there you can just subclass it instead of duplicating InternalStreamReader.)
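
To make the two options above more concrete, here is a minimal sketch that uses only System.IO and no FileHelpers types. The names MarkerLineReader and Normalize are hypothetical (they are not part of FileHelpers), and the sketch assumes the marker never appears inside a field. It shows a StreamReader subclass whose ReadLine ends a record at a literal marker such as '++EOL++', plus a small pre-parse helper that rewrites the records with ordinary line breaks. It has not been tuned for 10GB inputs.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper, not part of FileHelpers: a StreamReader whose ReadLine
// ends a record at a literal marker such as "++EOL++" instead of '\n' or '\r'.
public class MarkerLineReader : StreamReader
{
    private readonly string marker;

    public MarkerLineReader(Stream stream, Encoding encoding, string marker)
        : base(stream, encoding)
    {
        if (string.IsNullOrEmpty(marker))
            throw new ArgumentException("marker must not be empty", "marker");
        this.marker = marker;
    }

    // Returns the text up to (not including) the next marker,
    // or null once the end of the input has been reached.
    public override string ReadLine()
    {
        var record = new StringBuilder();
        bool readAny = false;
        int ch;
        while ((ch = Read()) != -1)
        {
            readAny = true;
            record.Append((char)ch);
            if (EndsWithMarker(record))
            {
                record.Length -= marker.Length; // drop the marker itself
                return record.ToString();
            }
        }
        // Trailing text without a closing marker is returned as a last record.
        return readAny ? record.ToString() : null;
    }

    private bool EndsWithMarker(StringBuilder text)
    {
        if (text.Length < marker.Length) return false;
        int offset = text.Length - marker.Length;
        for (int i = 0; i < marker.Length; i++)
        {
            if (text[offset + i] != marker[i]) return false;
        }
        return true;
    }

    // Pre-parse step: copy every record to 'output', each one terminated
    // by the writer's normal line break.
    public static void Normalize(MarkerLineReader input, TextWriter output)
    {
        string record;
        while ((record = input.ReadLine()) != null)
        {
            output.WriteLine(record);
        }
    }
}
```

For the pre-parse route, you would open the .dat file with File.OpenRead, wrap it in a MarkerLineReader with the file's actual encoding, and pass a StreamWriter for a temporary file to Normalize. The resulting file has ordinary line breaks and can then be read with FileHelpers in the usual way. For the subclassing route, the same ReadLine logic is roughly what a custom reader would need in place of the '\n' and '\r' cases.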