python pdf acrobat pdf-form pdfrw

Python - How to properly fill a multiline text field in PDF form using pdfrw?


I'm filling a PDF form using Python with pdfrw. I have no problem with any single-line text field in the form. But when I try to fill a multi-line text field, it doesn't render properly: it ignores line breaks.

This is part of my code:

pdf.Root.AcroForm.update(PdfDict(NeedAppearances=PdfObject('true')))

for x in range(0, len(pdf.Root.AcroForm.Fields)):
    try:
        if pdf.Root.AcroForm.Fields[x].T in ['(Observaciones)']:
            pdf.Root.AcroForm.Fields[x].update(PdfDict(V='This\nis\nmultiline', Ff=1))
            continue

This is the output.

[screenshot of the rendered output]

These are the settings of the form field in Adobe Acrobat.

[screenshot of the form field settings in Adobe Acrobat]

I have selected the options: Multiline, Scroll Long Text, Allow Rich Text Formatting.

I have tried using \r and the <br> tag too.

How should I set the value to render properly?


Solution

  • After struggling with flattening my fields (setting them to read-only) while keeping the multiline setting intact, I finally found the solution.

    I came across this rushed solution of just setting the ‘/Ff’ value to 1, but doing so risks removing some of the field's formatting. Here is a better explanation of how to manipulate this field.

    Page 552 of the ‘PDF Reference: third edition’ states:

    “The value of the field dictionary’s Ff entry is an unsigned 32-bit integer containing flags specifying various characteristics of the field. Bit positions within the flag word are numbered from 1 (low-order) to 32 (high-order).”

    So, when I looked into the Ff entry, I got 4096 (binary 0001 0000 0000 0000) for a lot of the fields I was struggling with. It turns out that the 13th bit controls the multiline setting, so setting Ff to 1 erases the rest of the settings and only marks the field as read-only.

    You could just do it the lazy way and give Ff the value 4097 (that is, your current Ff value plus 1, which is only correct while the read-only bit is still clear), or flip the bit you want in the integer flag value.
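To make the arithmetic concrete, here is a small sketch using plain integers only (no PDF library involved). The constant names are my own labels for the two bits discussed above; the value 4096 is the one reported for the form.

```python
# Spec bit positions are 1-based, so bit N has the value 1 << (N - 1).
MULTILINE = 1 << 12   # bit 13 -> 4096
READ_ONLY = 1 << 0    # bit 1  -> 1

old_ff = 4096                    # multiline only, as found in the form
overwritten = 1                  # what assigning Ff=1 leaves behind
combined = old_ff | READ_ONLY    # keeps multiline and adds read-only

print(bool(overwritten & MULTILINE))  # False: the multiline flag is gone
print(bool(combined & MULTILINE))     # True: the multiline flag survives
print(combined)                       # 4097

# Adding 1 is only safe while the read-only bit is clear.
already = 4097
print(already + 1)          # 4098: read-only cleared, bit 2 set instead
print(already | READ_ONLY)  # 4097: OR leaves an already-set bit alone
```

Using a bitwise OR rather than addition avoids having to know whether the bit was already set.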

    Here is a simple way to do it, that is modifiable to your needs.

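    from pdfrw import PdfDict

    bitPosition = 0  # zero-based position; 0 is the spec's bit 1 (read-only)
    bitValue = 1     # 1 sets the bit, 0 clears it
    mask = 1 << bitPosition
    field = pdf.Root.AcroForm.Fields[x]
    oldFfValue = int(field.Ff or 0)  # treat a missing Ff entry as 0
    newFfValue = (oldFfValue & ~mask) | ((bitValue << bitPosition) & mask)
    # Update the field dictionary itself, not the Ff value it contains
    field.update(PdfDict(Ff=newFfValue))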
    

    P.S. I'm not 100% sure that accessing the Ff value works this way for you, since my approach was based on this person's example code, which fit my needs better.

    All in all, I recommend giving the PDF reference document a look if you want to change the other bits and see the characteristics they control.

    Other useful bits are, for example:

    • 14th bit marks a password field (typed characters are shown as stars).
    • 23rd bit is for disabling spell-check.
    • 24th bit is for disabling scrolling of the field.
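As a rough sketch, the bits mentioned in this answer can be given names so that an Ff value is easier to read and combine. The class name, member names, and the helper functions `bit` and `describe` are all hypothetical choices of mine, not part of pdfrw; only the bit positions come from the text above.

```python
from enum import IntFlag


def bit(position):
    """Convert a 1-based spec bit position to its integer value."""
    return 1 << (position - 1)


class FieldFlag(IntFlag):
    # Only the bits discussed in this answer are listed here.
    READ_ONLY = bit(1)
    MULTILINE = bit(13)
    PASSWORD = bit(14)
    DO_NOT_SPELL_CHECK = bit(23)
    DO_NOT_SCROLL = bit(24)


def describe(ff):
    """Return the names of the known flags that are set in an Ff value."""
    return [flag.name for flag in FieldFlag if ff & flag]


print(describe(4096))                                      # ['MULTILINE']
print(describe(4097))                                      # ['READ_ONLY', 'MULTILINE']
print(int(FieldFlag.MULTILINE | FieldFlag.DO_NOT_SCROLL))  # 8392704
```

With names like these, a new Ff value can be built with `old_value | FieldFlag.READ_ONLY` instead of a bare number.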