Search code examples
csvquotes

csv parsing, quotes as characters


I have a csv file that contains fields such as this:

""#33CCFF"

I would imagine that should be the text value:

"#33CCFF

But both excel and open office calc will display:

#33CCFF"

What rule of csv am I missing?


Solution

  • When Excel parses the value, it does not first remove the outer quotes, to then proceed reading what's in between them. Even if it would, what would it do with the remaining " in front of #? It can not show this as you expect "#33CCFF, because for it to appear like that, the double quote should have been escaped by duplicating it. (That might be the 'csv' rule you are missing.)

    Excel reads the value from left to right, interpreting "" as a single ", it then reads on, and finds an unexpected double quote at the end, 'panics' and simply displays it.

    The person/application creating the csv file made the mistake of adding encapsulation without escaping the encapsulation character within the values. Now your data is malformed. Since when using encapsulation, the value should be """#33CCFF", and when not using encapsulation, it should be "#33CCFF.

    This might clarify some things