c# parsing csv filehelpers

CSV + FileHelpers + Double Quotes = Nightmare


I can't seem to handle a CSV I got. It's a file generated by a bank, which looks like this:

"000,""PLN"",""XYZ"",""2011-08-31"",""2011-08-31"",""0,00"""
1,""E"",""2011-08-30"",""2011-08-31"",""2011-08-31"",""399,00"",""0000103817846977"",""UZNANIE OTRZYMANE ELIXIR"",""23103015080000000550217023"",""XXX"",""POLISA UBEZPIECZENIA NR XXX  "",""000""
3,""E"",""2011-08-31"",""2011-08-31"",""2011-08-31"",""1433,00"",""0000154450232753"",""UZNANIE OTRZYMANE ELIXIR"",""000"",""XXX"",""POLISA UBEZPIECZENIA XXX  "",""000""

(I changed all sensitive information).

I've been trying to parse it since morning, with no luck. I tried the LINQ to CSV example found somewhere on the net and the CodeProject one (both threw an error saying that the CSV is corrupted), and I ended up with FileHelpers, which SEEMS to work, BUT:

  1. It splits the "399,00" and similar values into two fields.
  2. When I use the [FieldQuoted()] attribute it all goes to hell, since all the fields are wrapped in DOUBLED quotation marks. I suspect that is why the other parsers wouldn't work either.

Any ideas how to handle it?


Solution

  • If the problem is the doubled quotes, you could preprocess each line, replacing every pair of consecutive double quotes ("") with a single double quote ("):

    line = line.Replace( "\"\"", "\"" );
    

    Once the whole file has been processed, you can hand it to any other CSV processor. It will probably be easier to write your own anyway.
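
For the "write your own" route, here is a minimal sketch of what that could look like. It is not a full CSV parser: `BankCsv` and `ParseLine` are hypothetical names, and it assumes that fields never contain literal quote characters and that a line which both starts and ends with a quote (as the first sample line appears to) is wrapped in one extra outer pair of quotes.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class BankCsv
{
    // Hypothetical helper: collapses the doubled quotes, then splits the line
    // on commas that are not inside a quoted field.
    public static List<string> ParseLine(string line)
    {
        // Assumption: a line that starts AND ends with a quote carries an
        // extra outer pair of quotes, so strip that pair first.
        if (line.Length >= 2 && line[0] == '"' && line[line.Length - 1] == '"')
            line = line.Substring(1, line.Length - 2);

        // The preprocessing step from the answer: "" becomes ".
        line = line.Replace("\"\"", "\"");

        var fields = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false;

        foreach (char c in line)
        {
            if (c == '"')
            {
                // A quote only toggles the state; it is not part of the value.
                inQuotes = !inQuotes;
            }
            else if (c == ',' && !inQuotes)
            {
                // A comma outside quotes ends the current field.
                fields.Add(current.ToString());
                current.Clear();
            }
            else
            {
                // Everything else, including a comma inside quotes, is data.
                current.Append(c);
            }
        }

        fields.Add(current.ToString()); // last field has no trailing comma
        return fields;
    }
}
```

With this, a value such as "399,00" stays in one field because its comma sits between quotes. Converting it to a number afterwards would still need a culture that uses a decimal comma.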