I am converting DOCX to HTML and back to DOCX. The final DOCX is generated successfully, but the conversion skews the formatting of the table: the cells in the final document are wider than in the original, so the table extends past the page boundary.
Is there a way for me to keep the same format after conversion? Any advice is greatly appreciated.
Below is my code:
private void convertHtmlToDocx() throws IOException, JAXBException, Docx4JException {
    // Convert back to docx
    String inputfilepath = System.getProperty("user.dir") + "myPath";
    String baseURL = "file:///" + System.getProperty("user.dir") + "path";
    String stringFromFile = FileUtils.readFileToString(new File(inputfilepath), "UTF-8");

    // Unescape only when the file holds escaped markup (an escaped closing tag)
    String unescaped = stringFromFile;
    if (stringFromFile.contains("&lt;/")) {
        unescaped = StringEscapeUtils.unescapeHtml(stringFromFile);
    }
    System.out.println("Unescaped: " + unescaped);

    // Set up font mapping
    RFonts rfonts = Context.getWmlObjectFactory().createRFonts();
    rfonts.setAscii("Century Gothic");
    XHTMLImporterImpl.addFontMapping("Century Gothic", rfonts);

    // Create an empty docx package with default numbering
    WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.createPackage();
    NumberingDefinitionsPart ndp = new NumberingDefinitionsPart();
    wordMLPackage.getMainDocumentPart().addTargetPart(ndp);
    ndp.unmarshalDefaultNumbering();

    // Convert the XHTML, and add it into the empty docx we made
    XHTMLImporterImpl XHTMLImporter = new XHTMLImporterImpl(wordMLPackage);
    XHTMLImporter.setTableFormatting(FormattingOption.IGNORE_CLASS);
    XHTMLImporter.setParagraphFormatting(FormattingOption.IGNORE_CLASS);
    XHTMLImporter.setHyperlinkStyle("Hyperlink");
    wordMLPackage.getMainDocumentPart().getContent().addAll(XHTMLImporter.convert(unescaped, baseURL));

    wordMLPackage.save(new java.io.File(System.getProperty("user.dir") + "myPath"));
}
Is your use case web-based editing via an XHTML round trip?
If so, docx-html-editor may help. It works by saving state and hints that are then used during the round trip.
Aside from this, tables in Word either have fixed cell widths or they don't. Is the behaviour you describe occurring with a fixed-width table or not?
Fixed width should be ok (or easy enough to make so). Not fixed is harder...
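If the table turns out to be fixed width, one way to pull it back inside the margins is to rescale the column widths so that their sum equals the usable text width. The following is only a minimal sketch of that arithmetic in plain Java, not docx4j API: the class and method names (TableWidthFit, scaleToFit) are hypothetical, widths are assumed to be in twips (1/1440 inch), and the 9360 figure assumes a US Letter page with 1-inch margins. Writing the results back into the table (the w:gridCol and w:tcW values, with w:tblLayout set to fixed) is not shown.

```java
// Hypothetical helper: shrinks column widths proportionally so a table fits the page.
final class TableWidthFit {

    /**
     * Returns a copy of widths, scaled down proportionally when their sum
     * exceeds availableTwips. Tables that already fit are returned unchanged.
     */
    static int[] scaleToFit(int[] widths, int availableTwips) {
        if (availableTwips <= 0) {
            throw new IllegalArgumentException("availableTwips must be positive");
        }
        long total = 0;
        for (int w : widths) {
            if (w < 0) {
                throw new IllegalArgumentException("column widths must be non-negative");
            }
            total += w;
        }
        int[] scaled = widths.clone();
        if (total <= availableTwips) {
            return scaled; // already fits (this also covers an empty table)
        }
        long used = 0;
        for (int i = 0; i < widths.length; i++) {
            // Integer division rounds down, so the scaled sum never overshoots
            scaled[i] = (int) ((long) widths[i] * availableTwips / total);
            used += scaled[i];
        }
        // Hand the twips lost to rounding back, one per column from the left;
        // the remainder is always smaller than the number of columns
        int remainder = (int) (availableTwips - used);
        for (int i = 0; i < remainder; i++) {
            scaled[i]++;
        }
        return scaled;
    }

    public static void main(String[] args) {
        // Assumed page setup: US Letter (12240 twips wide) minus two 1440-twip margins
        int available = 12240 - 2 * 1440; // 9360
        int[] fitted = scaleToFit(new int[] {5000, 3000, 4000}, available);
        System.out.println(java.util.Arrays.toString(fitted)); // prints [3900, 2340, 3120]
    }
}
```

Scaling proportionally keeps the relative column sizes from the original document, which is usually what a round trip is trying to preserve. For a table that is not fixed width this does not help much, since Word recomputes the column sizes from the content.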