
Can't set RTL direction for Hebrew letters while converting from *.xhtml to *.pdf by using iText library

I'm trying to convert an *.xhtml file containing Hebrew characters (UTF-8) to PDF using the iText library, but all the letters come out in reverse order. As far as I understand from this question, I can set RTL only for ColumnText and PdfPCell objects:

Arabic (and Hebrew) can only be rendered correctly in the context of ColumnText and PdfPCell.

So I wonder: is it possible to convert a whole *.xhtml page to PDF with the Hebrew text rendered right to left?
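For context on why the order matters: Hebrew is stored in logical order (first letter first), so a renderer that lays the characters out left to right without bidirectional processing shows them reversed. The sketch below uses only the JDK's java.text.Bidi, not iText, to confirm that the sample string needs RTL handling. The class and method names are made up for illustration, and the sample text is written with Unicode escapes so the source file's encoding does not matter.

```java
import java.text.Bidi;

final class BidiCheck {
    /** True if the text contains characters that need bidirectional layout. */
    static boolean needsBidi(String text) {
        char[] chars = text.toCharArray();
        return Bidi.requiresBidi(chars, 0, chars.length);
    }

    /** True if the whole paragraph resolves to right-to-left. */
    static boolean isPureRtl(String text) {
        // Base direction is taken from the first strong character, defaulting to LTR
        Bidi bidi = new Bidi(text, Bidi.DIRECTION_DEFAULT_LEFT_TO_RIGHT);
        return bidi.isRightToLeft();
    }

    public static void main(String[] args) {
        // The sample text from the XHTML file, as Unicode escapes
        String hebrew = "\u05E9\u05DC\u05D5\u05DD \u05E2\u05D5\u05DC\u05DD";
        System.out.println("Hebrew needs bidi: " + needsBidi(hebrew));
        System.out.println("Hebrew is pure RTL: " + isPureRtl(hebrew));
        System.out.println("Latin needs bidi: " + needsBidi("Hello world"));
    }
}
```

If a check like this reports RTL text, the string itself is fine and the reversal happens at layout time, which is where the run direction setting comes in.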

This is the *.xhtml file I'm trying to import:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"

<html xmlns="">

  <title>Title of document</title>

<body style="font-size:12.0pt; font-family:Arial">
  שלום עולם


And this is the Java code I use:

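import java.io.*;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import com.itextpdf.text.Document;
import com.itextpdf.text.pdf.PdfWriter;
import com.itextpdf.tool.xml.XMLWorkerHelper;

public static void convert() throws Exception {
    Document document = new Document();
    PdfWriter writer = PdfWriter.getInstance(document, new FileOutputStream("import.pdf"));
    document.open();

    // Read the XHTML source as UTF-8
    String str = null;
    BufferedReader in = new BufferedReader(new InputStreamReader(new FileInputStream("import.xhtml"), "UTF8"));
    StringBuilder sb = new StringBuilder();

    while ((str = in.readLine()) != null) {
        sb.append(str);
    }
    in.close();

    XMLWorkerHelper worker = XMLWorkerHelper.getInstance();

    InputStream is = new ByteArrayInputStream(sb.toString().getBytes(StandardCharsets.UTF_8));
    worker.parseXHtml(writer, document, is, Charset.forName("UTF-8"));
    document.close();
}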
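As a side note, reading the file line by line discards the line breaks and leaves the reader open if an exception is thrown. A shorter way to load the whole file as UTF-8 might look like the sketch below; `Utf8Reader` and `readUtf8` are hypothetical names, and this only covers loading the text, not the RTL problem itself.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class Utf8Reader {
    /** Reads the whole file and decodes it as UTF-8, keeping line breaks intact. */
    static String readUtf8(Path path) throws IOException {
        return new String(Files.readAllBytes(path), StandardCharsets.UTF_8);
    }
}
```

The resulting string can then be turned back into a stream with `getBytes(StandardCharsets.UTF_8)` exactly as in the code above.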


This is what I get so far: the Hebrew letters appear in reverse order.

Thank you for any help.


  • Please take a look at the ParseHtml10 example. In this example, we take the file hebrew.html:

      <title>Hebrew text</title>
    <body style="font-size:12.0pt; font-family:Arial">
    <div dir="rtl" style="font-family: Noto Sans Hebrew">שלום עולם</div>

    And we convert it to PDF using this code:

    public void createPdf(String file) throws IOException, DocumentException {
        // step 1
        Document document = new Document();
        // step 2
        PdfWriter writer = PdfWriter.getInstance(document, new FileOutputStream(file));
        // step 3
        document.open();
        // step 4
        // Styles
        CSSResolver cssResolver = new StyleAttrCSSResolver();
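        XMLWorkerFontProvider fontProvider = new XMLWorkerFontProvider(XMLWorkerFontProvider.DONTLOOKFORFONTS);
        // Register the Hebrew font; this path is an assumption, point it at your own copy of the TTF
        fontProvider.register("resources/fonts/NotoSansHebrew-Regular.ttf");
        CssAppliers cssAppliers = new CssAppliersImpl(fontProvider);
        HtmlPipelineContext htmlContext = new HtmlPipelineContext(cssAppliers);
        // The context needs a tag processor factory to handle the HTML tags
        htmlContext.setTagFactory(Tags.getHtmlTagProcessorFactory());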
        // Pipelines
        PdfWriterPipeline pdf = new PdfWriterPipeline(document, writer);
        HtmlPipeline html = new HtmlPipeline(htmlContext, pdf);
        CssResolverPipeline css = new CssResolverPipeline(cssResolver, html);
        // XML Worker
        XMLWorker worker = new XMLWorker(css, true);
        XMLParser p = new XMLParser(worker);
        // HTML is a constant holding the path to hebrew.html
        p.parse(new FileInputStream(HTML), Charset.forName("UTF-8"));
        // step 5
        document.close();
    }

    The result looks like hebrew.pdf.

    Which hurdles do you need to clear?

    • You need to wrap your text in an element such as a <div> or a <td>.
    • You need to add the attribute dir="rtl" to define the direction.
    • You need to make sure that you're using a font that knows how to display Hebrew. I used a Noto font for Hebrew, one of the fonts Google distributes in its program to provide fonts for every possible language.
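The first two hurdles, plus naming the font, can be sketched as a small helper that wraps a piece of text in a `<div dir="rtl">` before the markup is handed to XML Worker. This is plain string handling with no iText calls; `RtlHtml`, `escape`, and `rtlDiv` are hypothetical names, and the font family is assumed to be a trusted value that matches a font you have registered.

```java
final class RtlHtml {
    /** Escapes the characters that would otherwise break the XHTML markup. */
    static String escape(String text) {
        return text.replace("&", "&amp;")
                   .replace("<", "&lt;")
                   .replace(">", "&gt;");
    }

    /** Wraps text in a div that declares RTL direction and the font to use. */
    static String rtlDiv(String text, String fontFamily) {
        return "<div dir=\"rtl\" style=\"font-family: " + fontFamily + "\">"
                + escape(text) + "</div>";
    }

    public static void main(String[] args) {
        // Sample text from the question, written as Unicode escapes
        String hebrew = "\u05E9\u05DC\u05D5\u05DD \u05E2\u05D5\u05DC\u05DD";
        System.out.println(rtlDiv(hebrew, "Noto Sans Hebrew"));
    }
}
```

For the sample text this produces the same `<div>` line as in hebrew.html above.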

    I can't read Hebrew, but I hope the resulting PDF is correct and that this solves your problem.

    Important: this solution requires at least iText and XML Worker 5.5.5, because support for the dir attribute was introduced in 5.5.4 and improved in 5.5.5.
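If you want to guard against an older library on the classpath, a version check along these lines may help. It is only a sketch: `ITextVersionCheck` and its methods are hypothetical names, the release string is assumed to be purely numeric and dot-separated, and how you obtain that string (for example from your build file) is left open.

```java
final class ITextVersionCheck {
    /** Compares two dotted numeric versions; missing parts count as zero. */
    static int compareVersions(String a, String b) {
        String[] left = a.split("\\.");
        String[] right = b.split("\\.");
        int length = Math.max(left.length, right.length);
        for (int i = 0; i < length; i++) {
            int l = i < left.length ? Integer.parseInt(left[i]) : 0;
            int r = i < right.length ? Integer.parseInt(right[i]) : 0;
            if (l != r) {
                return Integer.compare(l, r);
            }
        }
        return 0;
    }

    /** True if the release is at least 5.5.5, the minimum named above. */
    static boolean meetsMinimum(String release) {
        return compareVersions(release, "5.5.5") >= 0;
    }
}
```

The same check should be applied to both iText and XML Worker, since both need to be at 5.5.5 or later.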