java apache-poi docx4j

Removing content controls from Docx


I want to replace content controls (drop-down lists only) in a docx with their actual text, and then apply some logic to the document to extract tables using apache-poi. If I don't, cells containing a content control are not extracted. If I manually save my docx as Word 97-2003 (*.doc), Word offers to remove all content controls and replace them with the selected text, so I am planning to convert the docx to doc to get rid of the content controls. What I've explored so far:

  • I came across the Aspose.Words library, but it is a paid product. It can do the job in just 3 lines of code (tested with the trial version).

  • I tried POI itself but did not understand exactly how to do it. I tried the code below:

     XWPFDocument doc = new XWPFDocument(new FileInputStream("<DOCX_FILE_PATH>"));
     FileOutputStream fos = new FileOutputStream("<PATH_FOR_DOC_FILE>");
     doc.write(fos);
     fos.close();
    

It does create a .doc file, but it does not remove the content controls the way Aspose did.

  • I am holding off on JODConverter for now because it relies on LibreOffice or OpenOffice. We don't have either on the server and don't have permission to install new software.
  • I looked into Docx4J but, after checking its API, it looks like it can't do this.

What would be the best way to handle this scenario? Is there any way to replace content controls directly? Thanks!


Solution

  • docx4j can remove content controls

    The essence of the sample code at https://github.com/plutext/docx4j/blob/master/docx4j-samples-docx4j/src/main/java/org/docx4j/samples/ContentControlRemove.java is reproduced below:

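        import java.io.File;
        import java.io.FileInputStream;
        import org.docx4j.Docx4J;
        import org.docx4j.openpackaging.packages.WordprocessingMLPackage;

        // Input docx that contains the content controls
        String INPUT_DOCX = System.getProperty("user.dir") + "/some.docx";

        // Resulting docx
        String OUTPUT_DOCX = System.getProperty("user.dir") + "/OUT_ContentControlRemove.docx";

        // Load the input docx
        WordprocessingMLPackage wordMLPackage = Docx4J.load(new File(INPUT_DOCX));

        // There is no XML data to bind, so no XML stream is passed
        FileInputStream xmlStream = null;

        // Only the "remove content controls" step is requested via the flag
        Docx4J.bind(wordMLPackage, xmlStream, Docx4J.FLAG_BIND_REMOVE_SDT);

        // Save the document
        Docx4J.save(wordMLPackage, new File(OUTPUT_DOCX), Docx4J.FLAG_NONE);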
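    If it helps to see what "removing a content control" amounts to at the XML level, here is a rough sketch that uses only the JDK. It is not how docx4j implements it. In WordprocessingML a content control is a w:sdt element whose visible content sits inside its w:sdtContent child, so unwrapping means moving those children up and dropping the w:sdt wrapper. The class name SdtUnwrapSketch, the method unwrapSdts and the sample XML are made up for illustration, and reading word/document.xml out of the docx zip and writing it back is left out.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration only; not part of POI or docx4j.
class SdtUnwrapSketch {
    static final String W_NS =
            "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    /** Replaces every w:sdt element with the children of its w:sdtContent. */
    static String unwrapSdts(String documentXml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        Document dom = dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(documentXml)));

        // Copy the live NodeList into a plain list, because the tree is mutated below.
        NodeList live = dom.getElementsByTagNameNS(W_NS, "sdt");
        List<Element> sdts = new ArrayList<>();
        for (int i = 0; i < live.getLength(); i++) {
            sdts.add((Element) live.item(i));
        }

        // Walk in reverse document order so nested controls are unwrapped first.
        for (int i = sdts.size() - 1; i >= 0; i--) {
            Element sdt = sdts.get(i);
            Node parent = sdt.getParentNode();
            for (Node child = sdt.getFirstChild(); child != null; child = child.getNextSibling()) {
                boolean isContent = child instanceof Element
                        && W_NS.equals(child.getNamespaceURI())
                        && "sdtContent".equals(child.getLocalName());
                if (isContent) {
                    // Move each child of w:sdtContent to sit just before the w:sdt wrapper.
                    while (child.getFirstChild() != null) {
                        parent.insertBefore(child.getFirstChild(), sdt);
                    }
                }
            }
            // Drop the wrapper, together with its w:sdtPr properties.
            parent.removeChild(sdt);
        }

        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        t.transform(new DOMSource(dom), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Made-up fragment: a table cell wrapped in a drop-down content control.
        String xml = "<w:document xmlns:w=\"" + W_NS + "\"><w:body><w:tbl><w:tr>"
                + "<w:sdt><w:sdtPr><w:dropDownList/></w:sdtPr><w:sdtContent>"
                + "<w:tc><w:p><w:r><w:t>Approved</w:t></w:r></w:p></w:tc>"
                + "</w:sdtContent></w:sdt>"
                + "</w:tr></w:tbl></w:body></w:document>";
        System.out.println(unwrapSdts(xml));
    }
}
```

    After unwrapping, the w:tc is a direct child of w:tr again and still carries the selected text. Whether that by itself makes the cell show up in your POI table extraction is something to check against your own documents.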