
PDFTron: batch update attributes


I want to batch update the titles of all the fields in a PDF. Is there a way to do this? My plan was to iterate through the fields and change their T values, but this doesn't seem to work: changes to the fields don't persist from one iteration to the next, much less appear in the saved output file:

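import java.io.FileOutputStream
import com.pdftron.pdf.{PDFDoc, PDFNet}
import com.pdftron.sdf.SDFDoc

PDFNet.initialize()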
var doc = new PDFDoc(infile.getAbsolutePath)
var iter = doc.fdfExtract().getFieldIterator

while (iter.hasNext) {
  var field = iter.next
  var obj = field.findAttribute("T")
  if (obj != null && obj.isString) {
      obj.setString("new title")
      println(field.getName) // Outputs "new title"
  }
}

iter = doc.fdfExtract().getFieldIterator
while (iter.hasNext) {
  var field = iter.next
  var obj = field.findAttribute("T")
  if (obj != null && obj.isString) {
      println(field.getName) // Outputs the original title
  }
}

doc.save(new FileOutputStream("out.pdf"), SDFDoc.SaveMode.INCREMENTAL, null)
doc.close

Here's a decompressed, toy pdf on which I've experimented (uploaded as a text file). It has only one input.


Solution

  • The issue is that you are calling fdfExtract(), which exports (makes a copy of) the fields and returns them as an FDFDoc, so you are editing a temporary object. That is why a later call to fdfExtract() gives you the original data again: you never edited the original PDFDoc.

    If your intention is to edit the FDFDoc, then keep a reference to it and work on that one object: FDFDoc fdfdoc = pdfdoc.fdfExtract();

    If your intention is to edit the PDF itself, then remove your fdfExtract() calls and call pdfdoc.getFieldIterator() instead, so that you iterate over the fields of the PDFDoc directly.
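
As a rough illustration of the second option, the question's loop could be adapted along the following lines. This is a minimal sketch, not a tested implementation: it assumes the PDFTron Java SDK is on the classpath, that Field.rename is the appropriate call for changing a field's name in your SDK version (it is not used in the question, so check it against the API reference), and that "in.pdf", "out.pdf", and the "_renamed" suffix are placeholder values. A distinct name is given to each field because assigning the same name to every field would make them collide.

```scala
import com.pdftron.pdf.{PDFDoc, PDFNet}
import com.pdftron.sdf.SDFDoc

object RenameFields {
  def main(args: Array[String]): Unit = {
    // Newer SDK versions may expect a license key argument here.
    PDFNet.initialize()

    // "in.pdf" is a placeholder input path.
    val doc = new PDFDoc("in.pdf")
    try {
      // Iterate over the fields of the PDFDoc itself,
      // not over a copy produced by fdfExtract().
      val iter = doc.getFieldIterator()
      while (iter.hasNext()) {
        val field = iter.next()
        // Assumption: Field.rename updates the field's name in place.
        // The "_renamed" suffix is a hypothetical naming scheme.
        field.rename(field.getName() + "_renamed")
      }

      // "out.pdf" is a placeholder output path.
      doc.save("out.pdf", SDFDoc.SaveMode.INCREMENTAL, null)
    } finally {
      doc.close()
    }
  }
}
```

If you would rather keep working with the exported form data, the same loop applies to a single FDFDoc held in a variable, with the caveat that those edits live in the FDFDoc and not in the PDF.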