Search code examples
.netc++-clilocaleculture

How do I detect the user's locale to get the correct csv separator?


I have a simple data conversion tool and one of the outputs it can produce is a csv file.

This works perfectly here in the UK but when I shipped it out to a German customer I had some issues. Specifally, they use a ',' to represent the decimal point in a floating point number and vice versa. This means that when they open their data file in excel, the result is rather messy to say the least :-)

Substituting the correct character is trivial, but how can I detect whether or not to apply this?

Edit:

So this:

a,b,c
1.1,1.2,1.3
"1.1",1,2,"1,3"
"this,is,multi-
-line",this should be column 2, row 4
a;b;c
"a;b","c"

..looks like this when loaded into excel in the UK:

+----------------+-----+-----+-----+
| a              | b   | c   |     |
+----------------+-----+-----+-----+
| 1.1            | 1.2 | 1.3 |     |
+----------------+-----+-----+-----+
| 1.1            | 1   | 2   | 1,3 |
+----------------+-----+-----+-----+
| this,is,multi- |     |     |     |
| -line          | 2   | 4   |     |
+----------------+-----+-----+-----+
| a;b;c          |     |     |     |
+----------------+-----+-----+-----+
| a;b            | c   |     |     |
+----------------+-----+-----+-----+

..but what happens in Germany?


Solution

  • CSV files as the name suggest should be comma-seperated and are not local dependant. However what you could do to avoid this issue is double-quote the relevant decimal numbers within the CSV file as such: "10,20", "1,50", "This is another column". This should avoid the issue entirely for any decent CSV-parser (such as the FileHelpers library) which will read this as 10,20 and 1,50 and not as: 10, 20, 1, and 50.

    See CSV:

    More sophisticated CSV implementations permit commas and other special characters in a field value. Many implementations use " (double quote) characters around values that contain reserved characters (such as commas, double quotes, or newlines); embedded double quote characters may be represented by a pair of consecutive double quotes