My requirement is that there is a .xls file within a zip file, which can be downloaded using an URL. As I want to read this excel file in memory (for later processing), without downloading the zip locally, I have used ZipInputStream and this is how the main part of my code looks like:
String finalUrl = "https://server/myZip.zip"
URL url = new URL(finalUrl);
InputStream inputStream = new BufferedInputStream(url.openStream());
ZipInputStream zis = new ZipInputStream(inputStream);
ZipEntry file;
try {
while ((file = zis.getNextEntry()) != null) {
if (file.getName().endsWith(".xls")) {
log.info("xls file found");
log.info("file name : {}", file.getName());
byte excelBytes[] = new byte[(int)file.getSize()];
zis.read(excelBytes);
InputStream excelInputStream = new ByteArrayInputStream(excelBytes);
HSSFWorkbook wb = new HSSFWorkbook(excelInputStream);
HSSFSheet sheet = wb.getSheetAt(8);
log.info("sheet : {}", sheet.getSheetName());
}
else {
log.info("xls file not found");
}
}
}
finally{
zis.close();
}
But unfortunately I am receiving the following error:
java.lang.ArrayIndexOutOfBoundsException: Index -3 out of bounds for length 3247
Note: The .xls file is around 2MB and the zip file does not have any complex structure such as sub-directories or multiple files.
Any help here would be highly appreciated. Thanks!
Thanks to @PJ Fanning for highlighting this,
The problem was in zis.read(excelBytes)
which does not guarantee to read all the bytes. After using IOUtils.toByteArray
instead, the problem was resolved. The correct code is:
String finalUrl = "https://server/myZip.zip"
URL url = new URL(finalUrl);
InputStream inputStream = new BufferedInputStream(url.openStream());
ZipInputStream zis = new ZipInputStream(inputStream);
ZipEntry file;
try {
while ((file = zis.getNextEntry()) != null) {
if (file.getName().endsWith(".xls")) {
log.info("xls file found");
log.info("file name : {}", file.getName());
byte excelBytes[] = IOUtils.toByteArray(zis);
InputStream excelInputStream = new ByteArrayInputStream(excelBytes);
HSSFWorkbook wb = new HSSFWorkbook(excelInputStream);
HSSFSheet sheet = wb.getSheetAt(8);
log.info("sheet : {}", sheet.getSheetName());
}
else {
log.info("xls file not found");
}
}
}
finally{
zis.close();
}