I am facing an issue where I cannot load even the sample word2003xml.xml which is provided by doc4J for tests in docx4j-samples-docx4j-8.3.1.zip found here https://www.docx4java.org/downloads.html
I tried loading the file using 2 different constructors but the result is the same.
WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.load(new FileInputStream(new File("C:\\Mine\\project4tests\\word2003xml.xml")));
WordprocessingMLPackage wordMLPackage2 = WordprocessingMLPackage.load(new java.io.File("C:\\Mine\\project4tests\\word2003xml.xml"));
Here is the exception that I am getting:
Exception in thread "main" org.docx4j.openpackaging.exceptions.Docx4JException: Couldn't load xml from stream
at org.docx4j.openpackaging.packages.OpcPackage.load(OpcPackage.java:641)
at org.docx4j.openpackaging.packages.OpcPackage.load(OpcPackage.java:418)
at org.docx4j.openpackaging.packages.OpcPackage.load(OpcPackage.java:376)
at org.docx4j.openpackaging.packages.OpcPackage.load(OpcPackage.java:341)
at org.docx4j.openpackaging.packages.WordprocessingMLPackage.load(WordprocessingMLPackage.java:182)
at Main.main(Main.java:13)
Caused by: javax.xml.bind.UnmarshalException
with linked exception:
[com.sun.istack.internal.SAXParseException2; lineNumber: 3; columnNumber: 827; unexpected element (uri:"http://schemas.microsoft.com/office/word/2003/wordml", local:"wordDocument"). Expected elements are <{http://schemas.microsoft.com/office/2006/xmlPackage}package>,<{http://schemas.microsoft.com/office/2006/xmlPackage}part>,<{http://schemas.microsoft.com/office/2006/xmlPackage}xmlData>]
at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallerImpl.handleStreamException(UnmarshallerImpl.java:468)
at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallerImpl.unmarshal0(UnmarshallerImpl.java:402)
at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallerImpl.unmarshal(UnmarshallerImpl.java:371)
at org.docx4j.convert.in.FlatOpcXmlImporter.<init>(FlatOpcXmlImporter.java:132)
at org.docx4j.openpackaging.packages.OpcPackage.load(OpcPackage.java:638)
... 5 more
Caused by: com.sun.istack.internal.SAXParseException2; lineNumber: 3; columnNumber: 827; unexpected element (uri:"http://schemas.microsoft.com/office/word/2003/wordml", local:"wordDocument"). Expected elements are <{http://schemas.microsoft.com/office/2006/xmlPackage}package>,<{http://schemas.microsoft.com/office/2006/xmlPackage}part>,<{http://schemas.microsoft.com/office/2006/xmlPackage}xmlData>
at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallingContext.handleEvent(UnmarshallingContext.java:726)
at com.sun.xml.internal.bind.v2.runtime.unmarshaller.Loader.reportError(Loader.java:247)
at com.sun.xml.internal.bind.v2.runtime.unmarshaller.Loader.reportError(Loader.java:242)
at com.sun.xml.internal.bind.v2.runtime.unmarshaller.Loader.reportUnexpectedChildElement(Loader.java:109)
at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallingContext$DefaultRootLoader.childElement(UnmarshallingContext.java:1131)
at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallingContext._startElement(UnmarshallingContext.java:556)
at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallingContext.startElement(UnmarshallingContext.java:538)
at com.sun.xml.internal.bind.v2.runtime.unmarshaller.InterningXmlVisitor.startElement(InterningXmlVisitor.java:60)
at com.sun.xml.internal.bind.v2.runtime.unmarshaller.StAXStreamConnector.handleStartElement(StAXStreamConnector.java:231)
at com.sun.xml.internal.bind.v2.runtime.unmarshaller.StAXStreamConnector.bridge(StAXStreamConnector.java:165)
at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallerImpl.unmarshal0(UnmarshallerImpl.java:400)
... 8 more
Caused by: javax.xml.bind.UnmarshalException: unexpected element (uri:"http://schemas.microsoft.com/office/word/2003/wordml", local:"wordDocument"). Expected elements are <{http://schemas.microsoft.com/office/2006/xmlPackage}package>,<{http://schemas.microsoft.com/office/2006/xmlPackage}part>,<{http://schemas.microsoft.com/office/2006/xmlPackage}xmlData>
... 19 more
There is no issue loading a .DOCX file, however what I need to use the docx4J library is to convert an old .DOC (WordprocessingML more like an .XML) file into a .DOCX. Similar to what is done here https://coderanch.com/t/721499/java/Word-XML-DOCX
Does anybody know why I cannot load the file properly?
See https://github.com/plutext/docx4j/blob/master/docx4j-core/src/main/java/org/docx4j/convert/in/word2003xml/Word2003XmlConverter.java for 2003 XML files.
Note that .doc is the old binary format; its not XML, it is something different again.