I have an application that converts HTML files to DocX using DocX4J. I'm having problems with accented characters such as ç, á, é, í and ã. The font in my HTML files is Arial, but after conversion to DocX those characters are set in Calibri. So within a single word (e.g. Cláudio), "Cl" is in Arial, "á" is in Calibri and "udio" is in Arial again.
I read that I may have to set the font property on each w:r element, but I can't see how to do that for every run of the text being converted, or where it would fit in my conversion code, which is listed below together with a sample HTML file.
Any tip or suggestion about how to do this conversion and handle those special characters would be really great.
Cheers.
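    import java.util.List;
    import org.docx4j.convert.in.xhtml.XHTMLImporter;
    import org.docx4j.convert.in.xhtml.XHTMLImporterImpl;
    import org.docx4j.openpackaging.exceptions.Docx4JException;
    import org.docx4j.openpackaging.packages.WordprocessingMLPackage;

    public WordprocessingMLPackage export(String xhtml) {
        WordprocessingMLPackage wordMLPackage = null;
        try {
            // Create an empty package and import the XHTML into it
            wordMLPackage = WordprocessingMLPackage.createPackage();
            XHTMLImporter importer = new XHTMLImporterImpl(wordMLPackage);
            List<Object> content = importer.convert(xhtml, null);
            wordMLPackage.getMainDocumentPart().getContent().addAll(content);
        } catch (Docx4JException e) {
            // ...
        }
        return wordMLPackage;
    }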
<html>
<head>
<meta charset="ISO-8859-1" />
<style type="text/css">
h1 {
page-break-before: always;
}
p, h1 {
font-family: Arial;
font-size: 12pt;
}
p {
line-height: 150%;
}
h1 {
font-weight: bold;
line-height: 130%
}
</style>
</head>
<body>
<h1>RESUMO<br /></h1>
<p>
<span>Um resumo para o relatório.</span><br />
</p>
</body>
</html>
Following the tip given by JasonPlutext, I found an example of how to register a font mapping with the XHTMLImporter on the DocX4J forum (http://www.docx4java.org/forums/docx-java-f6/docx-to-html-and-back-to-docx-t1913.html).
Now my code is working! See the final version below.
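    import java.util.List;
    import org.docx4j.convert.in.xhtml.XHTMLImporter;
    import org.docx4j.convert.in.xhtml.XHTMLImporterImpl;
    import org.docx4j.jaxb.Context;
    import org.docx4j.openpackaging.exceptions.Docx4JException;
    import org.docx4j.openpackaging.packages.WordprocessingMLPackage;
    import org.docx4j.wml.RFonts;

    public WordprocessingMLPackage export(String xhtml) {
        WordprocessingMLPackage wordMLPackage = null;
        try {
            // Map the CSS font "Arial" to Arial for both ASCII and
            // non-ASCII (hAnsi) characters, so accented letters keep the font
            RFonts arialRFonts = Context.getWmlObjectFactory().createRFonts();
            arialRFonts.setAscii("Arial");
            arialRFonts.setHAnsi("Arial");
            XHTMLImporterImpl.addFontMapping("Arial", arialRFonts);

            wordMLPackage = WordprocessingMLPackage.createPackage();
            XHTMLImporter importer = new XHTMLImporterImpl(wordMLPackage);
            List<Object> content = importer.convert(xhtml, null);
            wordMLPackage.getMainDocumentPart().getContent().addAll(content);
        } catch (Docx4JException e) {
            // ...
        }
        return wordMLPackage;
    }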
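For anyone wondering why both setAscii and setHAnsi are needed: as far as I understand it, w:rFonts has separate font slots, and Word uses the ascii slot for characters up to U+007F and the hAnsi slot for other Latin characters such as "á". If only the ascii slot ends up set to Arial, the accented letters would fall back to the default font, which matches what I was seeing. The sketch below is plain Java with no DocX4J calls, and it only illustrates that split. The class FontSlotDemo and its methods are hypothetical names of my own, and the rule is deliberately simplified (it ignores the eastAsia and cs slots).

```java
import java.util.ArrayList;
import java.util.List;

public class FontSlotDemo {

    // Simplified assumption: code points up to U+007F use the "ascii" slot,
    // everything else is treated as "hAnsi". The eastAsia and cs slots
    // are not modelled here.
    static String fontSlot(int codePoint) {
        return codePoint <= 0x7F ? "ascii" : "hAnsi";
    }

    // Returns the slot name for each character of the given text, in order.
    static List<String> slotsFor(String text) {
        List<String> slots = new ArrayList<>();
        text.codePoints().forEach(cp -> slots.add(fontSlot(cp)));
        return slots;
    }

    public static void main(String[] args) {
        // "Cláudio", with the accented letter written as an escape so the
        // source file encoding does not matter
        String word = "Cl\u00E1udio";
        word.codePoints().forEach(cp ->
            System.out.println(new String(Character.toChars(cp)) + " -> " + fontSlot(cp)));
    }
}
```

Running it on "Cláudio" shows a single character landing in a different slot from its neighbours, which is the same place where the font changed in my converted document.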