java csv supercsv

Messed up CSV leads to Exception


I think I found a bug. Or maybe it isn't one, but Super CSV doesn't handle this case well.

I'm parsing a CSV file with 41 columns using a CsvMapReader. The CSV comes from a web service, and that web service messes up one line. The header line is a tab-delimited row with 41 cells.

The broken line is a tab-delimited row with 36 cells, and its content doesn't make any sense.

This is the code I'm using:


InputStream fis = new FileInputStream(pathToCsv);
InputStreamReader inReader = new InputStreamReader(fis, "ISO-8859-1");

ICsvMapReader mapReader = new CsvMapReader(inReader, new CsvPreference.Builder('"','\t',"\r\n").build());
final String[] headers = mapReader.getHeader(true);
Map<String, String> row;
while( (row = mapReader.read(headers)) != null ) {

    // do something


}

I get an exception when executing mapReader.read(headers) on the row I mentioned above. This is the exception:

org.supercsv.exception.SuperCsvException: 
the nameMapping array and the sourceList should be the same size (nameMapping length = 41, sourceList size = 36)
context=null
at org.supercsv.util.Util.filterListToMap(Util.java:121)
at org.supercsv.io.CsvMapReader.read(CsvMapReader.java:79)
at test.MyClass.readCSV(MyClass.java:20)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)

What do you think I should do?

I don't want the whole application to crash just because one row is messed up; I'd rather skip that row.


Solution

  • This is a good question! As a Super CSV developer, I'll look into creating some exception handling examples on the website.

    You could keep it simple and use CsvListReader (which doesn't care how many columns there are), and then just create the Map yourself:

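    import java.io.IOException;
    import java.io.StringReader;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    import org.supercsv.io.CsvListReader;
    import org.supercsv.io.ICsvListReader;
    import org.supercsv.prefs.CsvPreference;
    import org.supercsv.util.Util;

    public class HandlingExceptions {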
    
        private static final String INPUT = 
            "name\tage\nTom\t25\nAlice\nJim\t44\nMary\t33\tInvalid";
    
        public static void main(String[] args) throws IOException {
    
            // use CsvListReader (can't be sure there's the correct no. of columns)
            ICsvListReader listReader = new CsvListReader(new StringReader(INPUT), 
                new CsvPreference.Builder('"', '\t', "\r\n").build());
    
            final String[] headers = listReader.getHeader(true);
    
            List<String> row = null;
            while ((row = listReader.read()) != null) {
    
                if (listReader.length() != headers.length) {
                    // skip row with invalid number of columns
                    System.out.println("skipping invalid row: " + row);
                    continue;
                }
    
                // safe to create map now
                Map<String, String> rowMap = new HashMap<String, String>();
                Util.filterListToMap(rowMap, headers, row);
    
                // do something with your map
                System.out.println(rowMap);
            }
            listReader.close();
        }
    }
    

    Output:

    {name=Tom, age=25}
    skipping invalid row: [Alice]
    {name=Jim, age=44}
    skipping invalid row: [Mary, 33, Invalid]
    

    If you're concerned about relying on Super CSV's Util class (it's really an internal utility class, so it could change), you could combine two readers as I've suggested here.

    You could try catching SuperCsvException, but you might end up suppressing more than just an invalid number of columns. The only Super CSV exception I'd recommend catching (though it's not applicable in your situation, as you're not using cell processors) is SuperCsvConstraintViolationException, as it indicates that the file is in the correct format but the data doesn't satisfy your expected constraints.
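
If depending on the Util class is the concern, another option is to build the map by hand once the column count has been checked. The following is only a minimal sketch using plain JDK classes: RowMapping and toMap are hypothetical names, not part of Super CSV, and the helper covers only the simple case of pairing each header with the value at the same index. The rows are hard-coded here, where the real loop would pass in each row returned by listReader.read().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RowMapping {

    // Hypothetical helper (not part of Super CSV): pairs each header with the
    // value at the same index, or returns null if the column counts differ.
    static Map<String, String> toMap(String[] headers, List<String> row) {
        if (row == null || row.size() != headers.length) {
            return null; // caller decides whether to skip or log the row
        }
        // LinkedHashMap keeps the columns in header order
        Map<String, String> map = new LinkedHashMap<String, String>();
        for (int i = 0; i < headers.length; i++) {
            map.put(headers[i], row.get(i));
        }
        return map;
    }

    public static void main(String[] args) {
        String[] headers = {"name", "age"};

        // stand-in for the rows a list reader would return
        List<List<String>> rows = new ArrayList<List<String>>();
        rows.add(Arrays.asList("Tom", "25"));
        rows.add(Arrays.asList("Alice"));
        rows.add(Arrays.asList("Mary", "33", "Invalid"));

        for (List<String> row : rows) {
            Map<String, String> rowMap = toMap(headers, row);
            if (rowMap == null) {
                // skip row with invalid number of columns
                System.out.println("skipping invalid row: " + row);
                continue;
            }
            System.out.println(rowMap);
        }
    }
}
```

Returning null for a mismatched row keeps the skip decision in the calling loop, which mirrors the length check in the CsvListReader example above.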