java, locale, docx4j

PDF created using docx4j does not render locale text properly


I am using docx4j to create PDF files. In the docx output the locale-specific text is rendered properly, but in the PDF output the locale strings are replaced with # characters.

In the docx4j documentation I saw:

When docx4j is used to create a PDF, it can only use fonts which are available to it. These fonts come from 2 sources:

->those installed on the computer

->those embedded in the document

Note that Word silently performs font substitution. When you open an existing document in Word, and select text in a particular font, the actual font you see on the screen won't be the font reported in the ribbon if it is not installed on your computer or embedded in the document. To see whether Word 2007 is substituting a font, go into Word Options > Advanced > Show Document Content and press the "Font Substitution" button.

Word's font substitution information is not available to docx4j. As a developer, you have 3 options:

->ensure the font is installed or embedded

->tell docx4j which font to use instead, or

->allow docx4j to fall back to a default font

To embed a font in a document, open it in Word on a computer which has the font installed (check no substitution is occurring), and go to Word Options > Save > Embed Fonts in File.

But this doesn't seem to work.

Below is my code:

        // Map the document font "Algerian" to the installed font "Comic Sans MS".
        Mapper fontMapper = new IdentityPlusMapper();
        PhysicalFont font = PhysicalFonts.getPhysicalFonts().get(
                "Comic Sans MS");
        fontMapper.getFontMappings().put("Algerian", font);
        template.setFontMapper(fontMapper);

        // Convert the package held in "template" to PDF via XSL-FO.
        PdfSettings pdfSettings = new PdfSettings();
        org.docx4j.convert.out.pdf.PdfConversion conversion = new org.docx4j.convert.out.pdf.viaXSLFO.Conversion(
                template);
        OutputStream out = new FileOutputStream(f1);
        conversion.output(out, pdfSettings);

In the above code, the font used in the document is Algerian.

Any help will be much appreciated.


Solution

  • I am posting this answer because I have seen this question raised many times in connection with UTF encoding, and I hope it helps. The following piece of code solved the above problem:

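       // Save the docx4j package to disk as a .docx file.
       File f = new File("/path/to/sample.docx");
       template.save(f);
       // doc2pdf writes sample.pdf next to the input file.
       File f1 = new File("/path/to/sample.pdf");
       Runtime.getRuntime().exec("doc2pdf " + f);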
    

    If sample.docx is the input file and it contains text in any international language (Chinese, for example), it is converted to a PDF with the same file name at the same path.

    This works because Runtime.getRuntime().exec("doc2pdf " + f); runs the terminal command doc2pdf from the Java program, with Ubuntu as the OS. For the doc2pdf command to be available, unoconv must be installed first, by running sudo apt-get install unoconv from a terminal.
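
The exec call above returns immediately and does not report whether the conversion succeeded. A minimal sketch of a more defensive variant follows, assuming doc2pdf (from unoconv) is on the PATH and writes the PDF next to the input file, as described above. The class name Doc2PdfRunner, the helper names, and the 120-second timeout are hypothetical choices for illustration, not part of the original answer.

```java
import java.io.File;
import java.io.IOException;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.TimeUnit;

public class Doc2PdfRunner {

    // Pass the path as a separate argument so paths containing spaces stay intact.
    static List<String> buildCommand(File docx) {
        return Arrays.asList("doc2pdf", docx.getAbsolutePath());
    }

    // Assumption: doc2pdf writes <basename>.pdf into the same directory as the input.
    static File expectedPdf(File docx) {
        String name = docx.getName();
        int dot = name.lastIndexOf('.');
        String base = (dot > 0) ? name.substring(0, dot) : name;
        return new File(docx.getAbsoluteFile().getParentFile(), base + ".pdf");
    }

    // Run doc2pdf, wait for it to finish, and fail loudly if it did not succeed.
    static File convert(File docx) throws IOException, InterruptedException {
        Process p = new ProcessBuilder(buildCommand(docx))
                .inheritIO() // show the tool's own messages on our console
                .start();
        if (!p.waitFor(120, TimeUnit.SECONDS)) { // hypothetical timeout
            p.destroyForcibly();
            throw new IOException("doc2pdf timed out for " + docx);
        }
        if (p.exitValue() != 0) {
            throw new IOException("doc2pdf failed with exit code " + p.exitValue());
        }
        File pdf = expectedPdf(docx);
        if (!pdf.isFile()) {
            throw new IOException("Expected output not found: " + pdf);
        }
        return pdf;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java Doc2PdfRunner /path/to/sample.docx");
            return;
        }
        System.out.println("Wrote " + convert(new File(args[0])));
    }
}
```

Waiting for the process matters if the PDF is read right after the call: with a bare exec the file may not exist yet when the next line runs.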