Tags: java, export-to-csv, printwriter

Words not separated by commas when writing to CSV


My code successfully writes to a CSV file, but every line ends up in a single cell.

For example:

hămăit, hămăit, SUBST, 98226, 98226, null, ron, n
hămăit, hămăit, SUBST, 2, 98226, nom-sg, ron, n

The example above is divided correctly into lines, but each line is just one cell, not separated by commas.

This is how the text looks before being written to the CSV:

[hămăit, hămăit, SUBST, 98226, 98226, null, ron, {{n}}, hămăit, hămăit, SUBST, 2, 98226, nom-sg, ron, {{n}}, ...

Below is my code.

public String escapeSpecialCharacters(String data) {
   String escapedData = data.replaceAll("\\R", " ");
   escapedData = escapedData.replaceAll("\\{", "");
   escapedData = escapedData.replaceAll("}", "");
   return escapedData;
}

public String convertToCSV(WordDictObj data) {
   String[] strings = data.returnStringArray();
   return Stream.of(strings)
      .map(this::escapeSpecialCharacters)
      .collect(Collectors.joining(","));
}

public void givenDataArray_whenConvertToCSV_thenOutputCreated() throws IOException {
   OutputStream csvOutputFile = new FileOutputStream("parseOutPut.csv");
   try (PrintWriter pw = (new PrintWriter(csvOutputFile, true, StandardCharsets.UTF_8))) {
      dataLines.stream()
         .map(this::convertToCSV)
         .forEach(pw::println);

   }
}

I checked, and the file displays correctly both in Google Sheets and when viewed as a CSV inside my IDE (with comma selected as the value separator). Apparently it is only in Excel that the columns are not split. Excel also does not display the special characters correctly.

How it looks in Google Sheets: [screenshot]

How it looks in Notepad++: [screenshot]

How it looks in Excel: [screenshot]


Solution

  • In the Notepad++ screenshot you attached, the lines look fine and do have commas separating the fields. Google Sheets also handles the file correctly. Therefore, it is most likely an issue with Excel.

    As pointed out by g00se in the comments, it looks like you have to specify the encoding.
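
    The writer in the question already encodes as UTF-8, so the remaining problem is probably that Excel does not detect that encoding by itself. A common workaround is to write a UTF-8 byte order mark (BOM) as the very first character of the file. Separately, Excel splits columns on the list separator of the system's regional settings, which is a semicolon in many European locales, so a comma-separated file can land in one cell. The sketch below covers both points under those assumptions; it is not necessarily what g00se had in mind. The class name ExcelFriendlyCsv, the method write, and the delimiter parameter are hypothetical names introduced for illustration.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical helper, not part of the original question's code.
public class ExcelFriendlyCsv {

    // Writes the rows as UTF-8 text preceded by a BOM.
    // delimiter is ',' for a plain CSV; ';' may suit an Excel installation
    // whose regional list separator is a semicolon (assumption).
    public static void write(Path target, List<String[]> rows, char delimiter)
            throws IOException {
        try (PrintWriter pw = new PrintWriter(
                Files.newBufferedWriter(target, StandardCharsets.UTF_8))) {
            // U+FEFF is encoded in UTF-8 as the bytes EF BB BF, which Excel
            // commonly uses as a hint that the file is UTF-8.
            pw.write('\uFEFF');
            for (String[] row : rows) {
                pw.println(String.join(String.valueOf(delimiter), row));
            }
        }
    }
}
```

    For example, ExcelFriendlyCsv.write(Path.of("parseOutPut.csv"), rows, ',') keeps the comma-separated layout and only adds the BOM. If Excel still puts each line in one cell, passing ';' as the delimiter or importing the file through Excel's text import dialog are the usual alternatives. Note that this sketch does no quoting, so fields that themselves contain the delimiter would need extra handling.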