Tags: java, pdf, fonts, pdfbox, acrobat

Write Cyrillic characters into PDF form fields with PDFBox


I am using PDFBox 2.0.5 to fill out the form fields of a PDF document with this code:

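        import org.apache.pdfbox.pdmodel.PDDocument;
        import org.apache.pdfbox.pdmodel.PDDocumentCatalog;
        import org.apache.pdfbox.pdmodel.interactive.form.PDAcroForm;
        import org.apache.pdfbox.pdmodel.interactive.form.PDField;

        PDDocument doc = PDDocument.load(inputStream);
        PDDocumentCatalog catalog = doc.getDocumentCatalog();
        PDAcroForm form = catalog.getAcroForm();
        // write the same Cyrillic test value into every field of the form
        for (PDField field : form.getFieldTree()) {
            field.setValue("должен");
        }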

I get this error: U+0434 ('afii10069') is not available in this font Times-Roman (generic: TimesNewRomanPSMT) encoding: StandardEncoding with differences

The PDF document itself contains Cyrillic text, which is displayed fine. I have tried different fonts. For "Arial Unicode MS", Acrobat Reader wants to download a 50 MB "Adobe Acrobat Reader DC Font Pack". Is this a requirement for Cyrillic characters?

Which font do I have to specify in the text field to handle Cyrillic (or Asian) characters?

Thanks, Ropo


Solution

  • Adobe handles this by reusing the font file embedded in the /Ubuntu font and creating a new font resource from it. Here is a quick hack that can serve as a guide to achieving something similar. The code is specific to a sample I've got.

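    import java.io.File;
    import org.apache.fontbox.ttf.TrueTypeFont;
    import org.apache.pdfbox.cos.COSName;
    import org.apache.pdfbox.pdmodel.PDDocument;
    import org.apache.pdfbox.pdmodel.PDResources;
    import org.apache.pdfbox.pdmodel.font.PDFont;
    import org.apache.pdfbox.pdmodel.font.PDTrueTypeFont;
    import org.apache.pdfbox.pdmodel.font.PDType0Font;
    import org.apache.pdfbox.pdmodel.interactive.form.PDAcroForm;
    import org.apache.pdfbox.pdmodel.interactive.form.PDTextField;

    PDDocument doc = PDDocument.load(new File(...));
    PDAcroForm acroForm = doc.getDocumentCatalog().getAcroForm();
    PDResources formResources = acroForm.getDefaultResources();
    // the sample embeds a TrueType font under the resource name /Ubuntu
    PDTrueTypeFont font = (PDTrueTypeFont) formResources.getFont(COSName.getPDFName("Ubuntu"));

    // here is the 'magic' to reuse the font as a new font resource
    TrueTypeFont ttFont = font.getTrueTypeFont();

    // load it as a Type 0 (composite) font, which is not limited to a single-byte encoding
    PDFont font2 = PDType0Font.load(doc, ttFont, true);
    ttFont.close();

    // register the new font in the form's default resources as /F0
    formResources.put(COSName.getPDFName("F0"), font2);

    PDTextField formField = (PDTextField) acroForm.getField("Text2");
    // point the field at /F0; a font size of 0 means auto-size, "0 g" sets black text
    formField.setDefaultAppearance("/F0 0 Tf 0 g");
    formField.setValue("öäüинформацию");

    doc.save(...);
    doc.close();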
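
As background on why the original field font rejects U+0434: a simple font such as Times-Roman is addressed through a single-byte, Latin-oriented encoding, so Cyrillic code points have no slot in it, while a Type 0 font does not have that limit. The sketch below uses only the JDK and does not touch PDFBox. It treats ISO-8859-1 as a rough stand-in for the font's encoding (an assumption: the real StandardEncoding or WinAnsiEncoding tables differ in detail), and the class and method names `EncodingCheck` and `unencodable` are made up for illustration. It might be used to decide up front whether a value needs a Unicode-capable font.

```java
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class EncodingCheck {

    /**
     * Returns the code points of text that a Latin-1 encoder cannot represent,
     * formatted as U+XXXX. Latin-1 is only an approximation of a simple
     * PDF font's single-byte encoding.
     */
    static List<String> unencodable(String text) {
        CharsetEncoder latin1 = StandardCharsets.ISO_8859_1.newEncoder();
        List<String> missing = new ArrayList<>();
        text.codePoints().forEach(cp -> {
            String s = new String(Character.toChars(cp));
            if (!latin1.canEncode(s)) {
                missing.add(String.format("U+%04X", cp));
            }
        });
        return missing;
    }

    public static void main(String[] args) {
        // the Cyrillic sample word from the question, written with escapes
        System.out.println(unencodable("\u0434\u043e\u043b\u0436\u0435\u043d"));
        // prints [U+0434, U+043E, U+043B, U+0436, U+0435, U+043D]

        // Latin letters with umlauts (o, a, u) fit into Latin-1
        System.out.println(unencodable("\u00f6\u00e4\u00fc"));
        // prints []
    }
}
```

An empty result suggests the value may work with a standard Latin font; anything else points to replacing the field font as in the code above.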