I want to replace content controls (drop-down lists only) in a docx with their actual text, and then apply some logic to the document to extract tables using apache-poi. If I don't do this, cells that contain a content control are not extracted.
If I save my docx manually as Word 97-2003 (*.doc), Word offers to remove all content controls and replace them with the currently selected text, so I am planning to convert docx to doc to get rid of the content controls.
What I've explored so far:
Aspose.Words: it is a paid library, but it does the job in just 3 lines of code (tested with the trial version).
POI itself: I did not understand exactly how to do this with it. I tried the code below:
XWPFDocument doc = new XWPFDocument(new FileInputStream("<DOCX_FILE_PATH>"));
FileOutputStream fos = new FileOutputStream("<PATH_FOR_DOC_FILE>");
doc.write(fos);
fos.close();
This does create the doc file, but it does not remove the content controls the way Aspose did.
JODConverter: not an option because it relies on LibreOffice or OpenOffice. We don't have either on the server and don't have permission to install new software.
Docx4J: after checking its API, it looks like it can't do this.
What would be the best way to handle this scenario? Is there any way to replace content controls directly? Thanks!
docx4j can remove content controls.
The essence of the sample code at https://github.com/plutext/docx4j/blob/master/docx4j-samples-docx4j/src/main/java/org/docx4j/samples/ContentControlRemove.java is reproduced below:
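import java.io.File;
import java.io.FileInputStream;
import org.docx4j.Docx4J;
import org.docx4j.openpackaging.packages.WordprocessingMLPackage;

// Input docx that contains the content controls
String inputDocx = System.getProperty("user.dir") + "/some.docx";
// Resulting docx with the content controls removed
String outputDocx = System.getProperty("user.dir") + "/OUT_ContentControlRemove.docx";
// Load the input docx
WordprocessingMLPackage wordMLPackage = Docx4J.load(new File(inputDocx));
// There is no XML data to bind, so pass a null stream
FileInputStream xmlStream = null;
// FLAG_BIND_REMOVE_SDT asks docx4j to remove the content controls
Docx4J.bind(wordMLPackage, xmlStream, Docx4J.FLAG_BIND_REMOVE_SDT);
// Save the document
Docx4J.save(wordMLPackage, new File(outputDocx), Docx4J.FLAG_NONE);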
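If adding docx4j is not possible either, the same idea can be sketched with the JDK alone, since a docx is a zip archive whose main part is word/document.xml. The sketch below is not taken from docx4j or POI. The class name SdtUnwrapper and its two methods are hypothetical. It assumes that every w:sdt element can simply be replaced by the children of its w:sdtContent (which, for a drop-down list, hold the currently displayed text), and it only touches word/document.xml, not headers, footers, or other parts.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

// Hypothetical helper, not part of any library.
final class SdtUnwrapper {
    // WordprocessingML main namespace
    static final String W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    /** Replaces every w:sdt element with the children of its w:sdtContent. */
    static byte[] unwrap(byte[] documentXml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document dom = dbf.newDocumentBuilder().parse(new ByteArrayInputStream(documentXml));
            Node sdt;
            // Re-query each time: unwrapping an outer control exposes nested ones.
            while ((sdt = dom.getElementsByTagNameNS(W, "sdt").item(0)) != null) {
                Node parent = sdt.getParentNode();
                for (Node c = sdt.getFirstChild(); c != null; c = c.getNextSibling()) {
                    if (W.equals(c.getNamespaceURI()) && "sdtContent".equals(c.getLocalName())) {
                        // insertBefore moves each child out of sdtContent, keeping order.
                        while (c.getFirstChild() != null) {
                            parent.insertBefore(c.getFirstChild(), sdt);
                        }
                    }
                }
                parent.removeChild(sdt); // drops w:sdtPr and the now-empty wrapper
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            TransformerFactory.newInstance().newTransformer()
                    .transform(new DOMSource(dom), new StreamResult(out));
            return out.toByteArray();
        } catch (Exception e) {
            throw new IllegalStateException("Could not rewrite document XML", e);
        }
    }

    /** Copies a docx entry by entry, unwrapping controls in word/document.xml only. */
    static void rewriteDocx(Path in, Path out) throws IOException {
        try (ZipInputStream zin = new ZipInputStream(Files.newInputStream(in));
             ZipOutputStream zout = new ZipOutputStream(Files.newOutputStream(out))) {
            for (ZipEntry e; (e = zin.getNextEntry()) != null; ) {
                byte[] data = zin.readAllBytes(); // reads the current entry only
                if (e.getName().equals("word/document.xml")) {
                    data = unwrap(data);
                }
                zout.putNextEntry(new ZipEntry(e.getName()));
                zout.write(data);
                zout.closeEntry();
            }
        }
    }

    public static void main(String[] args) {
        // A table row whose only cell is wrapped in a drop-down content control.
        String sample = "<w:document xmlns:w=\"" + W + "\"><w:body><w:tbl><w:tr>"
                + "<w:sdt><w:sdtPr><w:dropDownList/></w:sdtPr><w:sdtContent>"
                + "<w:tc><w:p><w:r><w:t>Option A</w:t></w:r></w:p></w:tc>"
                + "</w:sdtContent></w:sdt></w:tr></w:tbl></w:body></w:document>";
        byte[] result = unwrap(sample.getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(result, StandardCharsets.UTF_8));
    }
}
```

With something like this, SdtUnwrapper.rewriteDocx(in, out) would be called on the docx before handing the output file to POI, so a cell that was wrapped in a content control becomes an ordinary w:tc inside its row. Treat it as a starting point to verify against your own documents rather than a drop-in replacement for the docx4j call above.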