Tags: java, csv, apache-commons, apache-commons-csv

Apache CSV Quote Character Does Not Work For Multiple Columns


I read a very simple CSV file like this:

String csv = "'ID', 'fruit'\n'1', 'apple'\n'2', 'banana'\n'3', 'cherry'";

try (InputStream resourceInputStream = new ByteArrayInputStream(csv.getBytes());
    InputStreamReader inputStreamReader = new InputStreamReader(resourceInputStream);) {

  CSVFormat format = CSVFormat.DEFAULT.withDelimiter(',').withHeader()
          .withSkipHeaderRecord(false).withRecordSeparator("\n").withTrim().withQuote('\'');
  CSVParser parser = format.parse(inputStreamReader);
  Iterator<CSVRecord> iterator = parser.iterator();

  while (iterator.hasNext()) {
    CSVRecord next = iterator.next();
    System.out.println(next.toMap());
  }
}

This prints the following to the console:

{ID=1, 'fruit'='apple'}
{ID=2, 'fruit'='banana'}
{ID=3, 'fruit'='cherry'}

I would of course expect:

{ID=1, fruit=apple}
{ID=2, fruit=banana}
{ID=3, fruit=cherry}

And it's not purely cosmetic either: if there is a separator inside the quotes, it is treated as if the quotes were not present. (So using "che,rry" will put "rry" in the third column.)

It does not work with " instead of ' either. It does not work with the default quote (which should be ", too). It does not work with withQuoteMode(). It does not work with previous Apache Commons CSV versions either (the current one is 1.8; I tested 1.7 and 1.6).

Does anyone have any idea what I need to do to make quotes work in the second and following columns?

Never mind: it works with withIgnoreSurroundingSpaces().


Solution

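  • The spaces after the commas in the header and value rows seem to be what confuses commons-csv: a quote character is only treated as a quote when it is the first character of a field, and withTrim() trims a value only after it has been parsed, so a field that starts with a space is read as plain text. With the spaces removed from the input, the same format gives the expected output: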

    Input:

    String csv = "'ID','fruit'\n" +
        "'1','apple'\n" +
        "'2','banana'\n" +
        "'3','cherry'";
    

    Output:

    {ID=1, fruit=apple}
    {ID=2, fruit=banana}
    {ID=3, fruit=cherry}
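
To illustrate why the leading space matters, here is a minimal sketch of the rule described above, written with the JDK only. It is a toy model and not the commons-csv lexer: the class name QuoteDemo and the method split are hypothetical, and the boolean flag only imitates what withIgnoreSurroundingSpaces() does in the real library (skipping blanks before a field so that the quote character comes first).

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: illustrates the quoting rule, not the real commons-csv lexer.
class QuoteDemo {

    // Splits one line into fields. The quote character opens a quoted field
    // only when it is the first character of the field; otherwise it is text.
    static List<String> split(String line, char delimiter, char quote,
                              boolean ignoreSurroundingSpaces) {
        List<String> fields = new ArrayList<>();
        int i = 0;
        while (true) {
            if (ignoreSurroundingSpaces) {
                // Skip blanks before the field so a following quote comes first.
                while (i < line.length() && line.charAt(i) == ' ') i++;
            }
            StringBuilder field = new StringBuilder();
            if (i < line.length() && line.charAt(i) == quote) {
                i++; // consume the opening quote
                // Inside quotes, delimiters are ordinary characters.
                while (i < line.length() && line.charAt(i) != quote) {
                    field.append(line.charAt(i++));
                }
                i++; // consume the closing quote
                // Ignore anything between the closing quote and the delimiter.
                while (i < line.length() && line.charAt(i) != delimiter) i++;
                fields.add(field.toString());
            } else {
                // Unquoted field: read up to the next delimiter.
                while (i < line.length() && line.charAt(i) != delimiter) {
                    field.append(line.charAt(i++));
                }
                // Trimming happens after parsing, so it cannot rescue the quotes.
                fields.add(field.toString().trim());
            }
            if (i >= line.length()) return fields;
            i++; // consume the delimiter
        }
    }

    public static void main(String[] args) {
        String line = "'3', 'che,rry'";
        // Without skipping blanks: three fields (3 | 'che | rry').
        System.out.println(split(line, ',', '\'', false).size());
        // With skipping blanks: two fields (3 | che,rry).
        System.out.println(split(line, ',', '\'', true).size());
    }
}
```

In this sketch the second field of "'3', 'che,rry'" starts with a space when blanks are not skipped, so its quotes are kept as literal characters and the embedded comma splits it, which matches the behaviour reported in the question.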