I have a simple data conversion tool and one of the outputs it can produce is a csv file.
This works perfectly here in the UK but when I shipped it out to a German customer I had some issues. Specifally, they use a ',
' to represent the decimal point in a floating point number and vice versa. This means that when they open their data file in excel, the result is rather messy to say the least :-)
Substituting the correct character is trivial, but how can I detect whether or not to apply this?
Edit:
So this:
a,b,c
1.1,1.2,1.3
"1.1",1,2,"1,3"
"this,is,multi-
-line",this should be column 2, row 4
a;b;c
"a;b","c"
..looks like this when loaded into excel in the UK:
+----------------+-----+-----+-----+
| a | b | c | |
+----------------+-----+-----+-----+
| 1.1 | 1.2 | 1.3 | |
+----------------+-----+-----+-----+
| 1.1 | 1 | 2 | 1,3 |
+----------------+-----+-----+-----+
| this,is,multi- | | | |
| -line | 2 | 4 | |
+----------------+-----+-----+-----+
| a;b;c | | | |
+----------------+-----+-----+-----+
| a;b | c | | |
+----------------+-----+-----+-----+
..but what happens in Germany?
CSV files as the name suggest should be comma-seperated and are not local dependant. However what you could do to avoid this issue is double-quote the relevant decimal numbers within the CSV file as such: "10,20", "1,50", "This is another column"
. This should avoid the issue entirely for any decent CSV-parser (such as the FileHelpers library) which will read this as 10,20
and 1,50
and not as: 10
, 20
, 1
, and 50
.
See CSV:
More sophisticated CSV implementations permit commas and other special characters in a field value. Many implementations use " (double quote) characters around values that contain reserved characters (such as commas, double quotes, or newlines); embedded double quote characters may be represented by a pair of consecutive double quotes