I am reading a CSV file with OpenCSV's CSVReaderBuilder, which doesn't work because the CSV file, for reasons I cannot change, has some lines with a missing column.
So I thought it would be a good idea to manipulate the BufferedReader I use as input for the CSVReaderBuilder and add the extra column before CSVReaderBuilder reads it, but unfortunately the CSVReaderBuilder then always returns null.
This code results in a com.opencsv.exceptions.CsvRequiredFieldEmptyException because the lines have different numbers of columns, but it works with a proper CSV file:
FileInputStream is;
try {
    is = new FileInputStream(fileName);
    InputStreamReader isr = new InputStreamReader(is, charSet);
    BufferedReader buffReader = new BufferedReader(isr);
    // use own CSVParser to set separator
    final CSVParser parser = new CSVParserBuilder()
            .withSeparator(separator)
            .build();
    // use own CSVReader make use of own CSVParser
    reader = new CSVReaderBuilder(buffReader)
            .withCSVParser(parser)
            .build();
} catch (FileNotFoundException e) {
    // TODO Auto-generated catch block
    e.printStackTrace();
}
So I added code that manipulates the BufferedReader to append an extra semicolon when the column count is 13 instead of 14, but this results in reader being null.
FileInputStream is;
try {
    is = new FileInputStream(fileName);
    InputStreamReader isr = new InputStreamReader(is, charSet);
    BufferedReader buffReader = new BufferedReader(isr);
    buffReader.lines().forEach(t -> {
        String a[] = t.split(";");
        int occurence = a.length;
        if(occurence == 13) {
            t = t.concat(";");
        }
    });
    // use own CSVParser to set separator
    final CSVParser parser = new CSVParserBuilder()
            .withSeparator(separator)
            .build();
    // use own CSVReader make use of own CSVParser
    reader = new CSVReaderBuilder(buffReader)
            .withCSVParser(parser)
            .build();
} catch (FileNotFoundException e) {
    // TODO Auto-generated catch block
    e.printStackTrace();
}
Does anyone have an idea what I'm doing wrong here?
There are a couple of problems here:
First, by the time buffReader is used in new CSVReaderBuilder(buffReader), it has already been fully consumed by buffReader.lines().forEach. A BufferedReader can only be read once, in general. A solution could ordinarily be to create a new InputStreamReader and BufferedReader on the same file, except in this case, you'll run into the second problem.
Second, the line t = t.concat(";"); does not work the way you expect. All this does is reassign the local variable t, which isn't used again. It does not change the contents of the file or the contents of the reader.
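Both problems can be reproduced without OpenCSV or a file. The following is a minimal sketch, assuming a StringReader stands in for the file; the class and method names are made up for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

public class ReaderConsumedDemo {

    // Hypothetical helper: runs the same kind of forEach as in the question,
    // then returns whatever the reader still has to offer.
    static String readAfterForEach(String content) {
        BufferedReader reader = new BufferedReader(new StringReader(content));
        reader.lines().forEach(t -> {
            // Reassigns only the lambda's own parameter; the reader's data is untouched.
            t = t.concat(";");
        });
        try {
            // The stream above has already read to the end, so nothing is left.
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readAfterForEach("a;b\nc;d\n")); // prints "null"
    }
}
```

A CSVReader built on top of such an exhausted reader has no lines left to parse, regardless of what the lambda did to t.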
How to fix this is less straightforward. As far as I know, this exception will only be thrown when binding the CSV data to a bean, and only if fields are marked as required = true. Given that the source data does not always contain data for the last field, it seems like it should not be marked as required.
If manipulating the source data really is your only option, I can think of a few possible approaches:
- Read the file, write the corrected content to a StringWriter, and then construct a StringReader with the result, and parse that.
- Use a PipedOutputStream and PipedInputStream to connect the code that writes the transformed data to the code that parses it.
- Write a FilterReader that transforms the file contents as they are read (not the most straightforward to implement).
Details of implementing these approaches would be too long and broad for this answer, so I would suggest creating follow-up questions if needed.
There might be additional options specific to the OpenCSV library that I'm not aware of.
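For orientation only, the StringWriter/StringReader idea might look roughly like the sketch below. The class CsvPadder and the method padLines are hypothetical names, and the sketch assumes that no field contains the separator inside quotes and that the whole file fits in memory.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.util.regex.Pattern;

public class CsvPadder {

    // Hypothetical helper: copies the source into a StringWriter and appends
    // separators to every non-empty line with fewer than expectedColumns fields.
    // Assumption: the separator never appears inside a quoted field.
    static String padLines(Reader source, char separator, int expectedColumns) {
        String regex = Pattern.quote(String.valueOf(separator));
        StringWriter out = new StringWriter();
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                out.write(line);
                if (!line.isEmpty()) {
                    // Limit -1 keeps trailing empty fields, so "a;b;" counts as three columns.
                    int columns = line.split(regex, -1).length;
                    for (int i = columns; i < expectedColumns; i++) {
                        out.write(separator);
                    }
                }
                out.write('\n');
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String corrected = padLines(new StringReader("a;b;c\nd;e\n"), ';', 3);
        System.out.print(corrected);
        // A new StringReader(corrected) could then be handed to
        // new CSVReaderBuilder(...) in place of the original buffReader.
    }
}
```

For the file in the question, the call would be something like padLines(buffReader, separator, 14), with the result wrapped in a new StringReader before it is passed to CSVReaderBuilder.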