OpenCSV reader strips ending quotes instead of ignoring them


Say I have:

id,description,amount
1,Foo "bar",10.5
2,Quick "brown" fox,9.0

I know that proper CSV should have Foo "bar" written as "Foo ""bar""" (the field enclosed in quotes, with the inner quotes doubled) to pick up the quotes. But this is the data I have to deal with 🤷, and I cannot modify it before processing.

try (CSVReader csvReader = new CSVReaderBuilder(new FileReader(resourcePath))
        .withSkipLines(1)
        .withCSVParser(new CSVParserBuilder().withIgnoreQuotations(true).build())
        .build()) {
    String[] line;
    ..

To work around this, I tried to make CSVReader ignore all quotes with withIgnoreQuotations(true), but it seems to strip the last quotation mark instead of ignoring it, so the output ends up as:

1,Foo "bar,10.5

Is there any way to get the following output with OpenCSV?

1,Foo "bar",10.5

Solution

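  • You can set the parser's quote character to one that never occurs in your data, so that the double quotes are treated as ordinary characters: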

    new CSVParserBuilder().withQuoteChar('§').build();
    

    This is not ideal, because you have to choose a character that is guaranteed never to appear in your data. I chose the section symbol (§), which may not work for you.

    Just out of interest, the Apache Commons CSV parser does not exhibit this behavior:

    <dependency>
        <groupId>org.apache.commons</groupId>
        <artifactId>commons-csv</artifactId>
        <version>1.8</version>
    </dependency>
    

    And:

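    import org.apache.commons.csv.CSVFormat;
    import org.apache.commons.csv.CSVParser;
    import org.apache.commons.csv.CSVRecord;

    // CSVParser.parse(String, CSVFormat) throws IOException, so call it from a method that declares or handles it.
    String sampleRecord = "1,Foo \"bar\",10.5";
    CSVParser parser = CSVParser.parse(sampleRecord, CSVFormat.DEFAULT);

    for (CSVRecord record : parser) {
        System.out.println(record.get(1)); // second column
    }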
    

    This prints:

    Foo "bar"
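
If every quote in the data is meant literally, then no field can contain an embedded comma, and a plain split on the delimiter may be enough. The following is a minimal sketch of that alternative using only the JDK. It is not OpenCSV's API; the class name `LiteralQuoteSplitter` and the method `splitLine` are hypothetical names chosen for illustration. It assumes one record per line and no commas inside fields.

```java
import java.util.List;

// Hypothetical helper, not part of OpenCSV or Commons CSV.
final class LiteralQuoteSplitter {

    // Splits one line on commas and keeps every other character, quotes included.
    // The limit of -1 preserves trailing empty fields such as "a,b,".
    static String[] splitLine(String line) {
        return line.split(",", -1);
    }

    public static void main(String[] args) {
        // Sample data taken from the question.
        List<String> lines = List.of(
                "id,description,amount",
                "1,Foo \"bar\",10.5",
                "2,Quick \"brown\" fox,9.0");

        // Skip the header row, as withSkipLines(1) does in the question.
        for (String line : lines.subList(1, lines.size())) {
            String[] fields = splitLine(line);
            System.out.println(fields[1]);
        }
    }
}
```

For the sample data this prints Foo "bar" and Quick "brown" fox. If a field might ever contain a comma or a line break, this shortcut is no longer safe and a real CSV parser is the better choice.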