I found here an example that traverses an existing docx file and prints its raw XML to standard output. I would like to turn this example into code that copies the document into a new file as it traverses it, instead of simply printing it. My eventual goal is to copy it with some prescribed additions of text.
I don't know exactly how to modify the snippet below so that the elements are recreated in a new WordprocessingMLPackage as they are encountered in the original one.
    new TraversalUtil(body,
        new Callback() {
            String indent = "";

            @Override
            public List<Object> apply(Object o) {
                String wrapped = "";
                if (o instanceof JAXBElement)
                    wrapped = " (wrapped in JAXBElement)";
                o = XmlUtils.unwrap(o);
                String text = "";
                if (o instanceof org.docx4j.wml.Text)
                    text = ((org.docx4j.wml.Text) o).getValue();
                System.out.println(indent + o.getClass().getName() + wrapped + " \""
                        + text + "\"");
                return null;
            }
            // other code
        } // end of Callback(){ ... }
    );
I also tried another approach: unzipping the docx and editing the raw XML in "word/document.xml" directly. When I zip the unzipped folder back up and rename it to .docx, MS Word cannot open it.
Copying objects is easy; you can use XmlUtils.deepCopy: https://github.com/plutext/docx4j/blob/master/docx4j-core/src/main/java/org/docx4j/XmlUtils.java#L1022
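A minimal sketch of that idea, assuming a body that holds only self-contained content (plain paragraphs, no images, nothing that depends on custom styles). The class name CopyBodySketch, the method copyBody and the extraText parameter are made up for illustration; the docx4j calls used are WordprocessingMLPackage.load, createPackage, getMainDocumentPart().getContent(), XmlUtils.deepCopy, addParagraphOfText and save.

```java
import java.io.File;

import org.docx4j.XmlUtils;
import org.docx4j.openpackaging.packages.WordprocessingMLPackage;
import org.docx4j.openpackaging.parts.WordprocessingML.MainDocumentPart;

public class CopyBodySketch {

    /**
     * Copies the top-level body content of one docx into a fresh package
     * and appends one extra paragraph. Hypothetical helper, not part of docx4j.
     */
    public static void copyBody(File in, File out, String extraText) throws Exception {
        WordprocessingMLPackage source = WordprocessingMLPackage.load(in);
        WordprocessingMLPackage target = WordprocessingMLPackage.createPackage();
        MainDocumentPart targetMain = target.getMainDocumentPart();

        // Deep-copy each top-level block (paragraphs, tables, ...) so the new
        // package does not share JAXB objects with the original one.
        for (Object o : source.getMainDocumentPart().getContent()) {
            targetMain.getContent().add(XmlUtils.deepCopy(o));
        }

        // The "prescribed addition of text": here simply one trailing paragraph.
        targetMain.addParagraphOfText(extraText);

        target.save(out);
    }
}
```

If all you need is the original document plus some added text, loading the package, modifying it in place and saving it under a new file name avoids the copy altogether.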
But many WordML elements have implicit or explicit relationships to other parts of the package, and you need to manage those to get the results you expect. For more detail, see https://www.docx4java.org/blog/2010/11/merging-word-documents/
For example, if the object references an image, you'll need to copy the image part and its relationship as well. If a paragraph references a style that is missing from the target document, it will appear unstyled. And so on.
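On the unzip/edit/zip approach from the question: I can't tell from the description what went wrong, but one common cause is zipping the folder itself, which nests everything under an extra top-level directory, whereas Word expects [Content_Types].xml at the root of the archive. (Another is XML that is no longer well-formed after editing.) The sketch below, using only java.util.zip, zips the contents of the folder with entry names relative to it; the class name DocxRezip and the method zipDirectory are made up for illustration.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class DocxRezip {

    /**
     * Zips the CONTENTS of unzippedDir (not the directory itself), so that
     * [Content_Types].xml ends up at the root of the archive.
     * docxOut must be located outside unzippedDir.
     */
    public static void zipDirectory(Path unzippedDir, Path docxOut) throws IOException {
        // Collect the files first so the directory stream is closed before writing.
        List<Path> files;
        try (Stream<Path> walk = Files.walk(unzippedDir)) {
            files = walk.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
        }

        try (OutputStream os = Files.newOutputStream(docxOut);
             ZipOutputStream zos = new ZipOutputStream(os)) {
            for (Path file : files) {
                // Entry names are relative to the folder and use forward slashes,
                // e.g. "word/document.xml" rather than "myfolder\word\document.xml".
                String name = unzippedDir.relativize(file).toString().replace('\\', '/');
                zos.putNextEntry(new ZipEntry(name));
                Files.copy(file, zos);
                zos.closeEntry();
            }
        }
    }
}
```

Listing the resulting archive should show "[Content_Types].xml", "_rels/.rels" and "word/document.xml" with no leading folder name.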