I have the issue that Apache POI "corrupted" a xlsm / xlsx file by just reading and writing it (e.g. with the following code)
public class Snippet {
public static void main(String[] args) throws Exception {
String str1 = "c:/tmp/spreadsheet.xlsm";
String str2 = "c:/tmp/spreadsheet_poi.xlsm";
// open file
XSSFWorkbook wb = new XSSFWorkbook(new FileInputStream(new File(str1)));
// save file
FileOutputStream out = new FileOutputStream(str2);
wb.write(out);
wb.close();
out.close();
}
}
Once you open the spreadsheet_poi.xlsm in Excel you'll get an error like the following
"We found a problem with some content in xxx. Do you want us to try to recover as much as we can..."?
If you say yes you'll end up with a log which could look like this:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<recoveryLog xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
<logFileName>error145040_01.xml</logFileName>
<summary>Errors were detected in file 'C:\tmp\spreadsheet_poi.xlsm'</summary>
<repairedParts>
<repairedPart>Repaired Part: /xl/worksheets/sheet4.xml part with XML error. Load error. Line 2, column 0.</repairedPart>
<repairedPart>Repaired Part: /xl/worksheets/sheet5.xml part with XML error. Load error. Line 2, column 0.</repairedPart>
<repairedPart>Repaired Part: /xl/worksheets/sheet8.xml part with XML error. Load error. Line 2, column 0.</repairedPart>
</repairedParts>
</recoveryLog>
Whats the best approach to debug the issue in more detail (e.g. find out what makes poi to "corrupt" the file?
Eventually I found how that the best approach for debugging this are two things
OOXMLPrettyPrint
(https://svn.apache.org/repos/asf/poi/trunk/src/ooxml/java/org/apache/poi/ooxml/dev/) to format the file and then reopen it it in excel.In my case the problem was that poi somehow changed the dimension setting from
<dimension ref="A1:XFD147"/>
to
<dimension ref="A1:XFE147"/>
(with XFE beeing a non existing column). I fixed it by removing those many empty columns in the original xlsx file.