
Get page from found AcroForm form field


I have an existing PDF that I want to open and add content to, on the page that contains a specific PDField (more specifically a PDTerminalField, not that I think it matters). The field may be on the first page or on any later one.
I know the name of the field, so I can look it up and even get its position and dimensions on that page (PDRectangle mediabox = new PDRectangle((COSArray) fieldDict.getDictionaryObject(COSName.RECT));).

However, I can't find a way to get the index of the page it is on, which I need in order to write on the correct page.

PDAcroForm acroForm = pdfDocument.getDocumentCatalog().getAcroForm();
PDField docField = acroForm.getField("the_coolest_field");

int page = docField.???  // This is the missing part.


PDPageContentStream contentStream = new PDPageContentStream(pdfDocument, 
pdfDocument.getPage(page), PDPageContentStream.AppendMode.APPEND, true);
// now write something on the page that contains the field.

Solution

  • Using the hints given in this comment, I was able to build a map from each field name to the (last) page it occurs on.

    HashMap<String, Integer> formFieldPages = new HashMap<>();
    // pdfDocument is the same PDDocument as in the question
    for (int page_i = 0; page_i < pdfDocument.getNumberOfPages(); page_i++) {
        List<PDAnnotation> annotations = pdfDocument.getPage(page_i).getAnnotations();
        for (PDAnnotation annotation: annotations) {
            if (!(annotation instanceof PDAnnotationWidget)) {
                System.err.println("Unknown annotation type " + annotation.getClass().getName() + ": " + annotation.toString());
                continue;
            }
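            // /T is the partial field name; it is only present when the widget and its field share one dictionary
            String name = ((PDAnnotationWidget) annotation).getCOSObject().getString(COSName.T);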
            if (name == null) {
                System.err.println("Unknown widget name: " + annotation.toString());
                continue;
            }
            // warn if this name was already recorded for an earlier annotation
            if (formFieldPages.containsKey(name)) {
                System.err.println("Duplicated widget name, overwriting previous page value " + formFieldPages.get(name) + " with newly found page " + page_i + ": " + annotation.toString());
            }
            formFieldPages.put(name, page_i);
        }
    }
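
The loop above keeps only the last page for a name and logs a warning when it overwrites an earlier one. If names can legitimately show up on several pages, one option is to store every page per name instead. The following is only a sketch using plain Java collections; `FieldPageIndex`, `record` and `pagesOf` are hypothetical names, not part of PDFBox, and `record` would be called where the loop currently does `formFieldPages.put(name, page_i)`.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not a PDFBox class): remembers every page a field name was seen on.
class FieldPageIndex {
    private final Map<String, List<Integer>> pagesByName = new LinkedHashMap<>();

    // Call once per named widget while looping over the pages.
    void record(String fieldName, int pageIndex) {
        List<Integer> pages = pagesByName.computeIfAbsent(fieldName, k -> new ArrayList<>());
        if (!pages.contains(pageIndex)) { // skip repeats of the same page
            pages.add(pageIndex);
        }
    }

    // Returns all recorded page indices for the name, or an empty list if it was never seen.
    List<Integer> pagesOf(String fieldName) {
        return pagesByName.getOrDefault(fieldName, Collections.emptyList());
    }
}
```

With this, a name that was never recorded yields an empty list rather than null, so the caller can decide what to do when a field has zero, one, or several pages.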
    

    Now looking up the page is as simple as:

    int page = formFieldPages.get(docField.getPartialName());
    

    Note that this throws a NullPointerException if no widget with that name was found: the map then returns null, which cannot be unboxed into an int.
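
To avoid that exception, the result can be kept boxed and checked before use. This is a minimal sketch in plain Java; `SafePageLookup` and `pageOf` are hypothetical names introduced here for illustration, and the map is assumed to be the `formFieldPages` map built above.

```java
import java.util.Map;
import java.util.OptionalInt;

// Hypothetical helper: a null-safe variant of the one-line lookup.
class SafePageLookup {
    // Returns the recorded page index, or an empty OptionalInt when the name is absent.
    static OptionalInt pageOf(Map<String, Integer> formFieldPages, String fieldName) {
        Integer page = formFieldPages.get(fieldName); // stays boxed, so null is harmless here
        return page == null ? OptionalInt.empty() : OptionalInt.of(page);
    }
}
```

The caller would then check `isPresent()` (or use `orElse(-1)`) before opening a content stream on that page.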


    Previous answer below. It seems I was wrong about that approach, but I am keeping it for reference:

    I found the /P entry, which looked like it could be the page:

    int page = (int) currentField.getCOSObject().getCOSObject(COSName.P).getObjectNumber();
    // I couldn't figure out why it's off by 4, but tests showed that the actual PDF page 1 (index [0])
    // is represented by `/P {4, 0}`, page 2 ([1]) is object 5, page 3 ([2]) is object 6, etc.
    // Object numbers are just identifiers, so this offset only holds for my test file.
    page = page - 4;
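
The reason this approach is fragile is that an object number only identifies the page object inside the file; it says nothing about where the page sits in the page order. A more robust idea is to search for the page in the ordered list of pages instead of subtracting a constant. The sketch below models that with plain Java and made-up object numbers; `PageIndexByObjectNumber` and `indexOfPage` are hypothetical names. In PDFBox 2.x the corresponding calls are, as far as I know, `PDAnnotation.getPage()` and `PDPageTree.indexOf(PDPage)`, but treat that mapping as an assumption to verify against your version.

```java
import java.util.List;

// Hypothetical model: find a page's position by searching the page order,
// instead of doing arithmetic on its object number.
class PageIndexByObjectNumber {
    // pageObjectNumbers: the object number of each page, listed in page order.
    // Returns the 0-based page index, or -1 if no page has that object number.
    static int indexOfPage(List<Long> pageObjectNumbers, long targetObjectNumber) {
        return pageObjectNumbers.indexOf(targetObjectNumber);
    }
}
```

For a file whose pages happen to be objects 4, 5 and 6 this gives the same result as subtracting 4, but it keeps working when the numbers are not consecutive or not in page order.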