I'm in the process of updating our scripts to ensure they remain functional, and discovered that iText7 has replaced iTextSharp. My needs are simple: read form fields. I already know how to read a form field; I'm just checking whether there's a more streamlined way to do it, since it seemed easier in iTextSharp.
Here's the old code we're using with iTextSharp (the $form is being fed to the $reader via a foreach loop):
#create pdf reader object and load form
$reader = New-Object iTextSharp.text.pdf.PdfReader -ArgumentList $form.PSPath.Replace("Microsoft.PowerShell.Core\FileSystem::","")
#Get the data I need
$First = $reader.AcroFields.GetField("FirstName")
Simple. When playing with iText7 though, it seems to lose its simplicity. Here's what I have for iText7:
#Create pdf reader and load form
$Reader = [iText.Kernel.Pdf.PdfReader]::new("C:\temp\TestForm.pdf")
#Create PDFDoc object?
$PdfDoc = [iText.Kernel.Pdf.PdfDocument]::new($Reader)
#What? Why?
$Form = [iText.Forms.PdfAcroForm]::GetAcroForm($PdfDoc, $True)
#Get the data I need. Oh wait, I am unable to read it.
$fName = $Form.GetField("FirstName")
#Finally...
$First = $fName.GetValue()
I'm afraid I haven't had any luck finding simple example code; everyone seems to be creating web forms on the fly, or parsing thousands of PDFs for data analytics. I'm also just a lowly SysAdmin, not a dev. Please tell me there's an easier way to read a single form field in iText7. Thanks in advance!
Simplicity is not necessarily measured by the number of lines of code. Your way of reading form fields in iText 7 is correct. The reason you need a couple more lines is that iText 7 separates the different parts of the code much more clearly across modules (here, the kernel module that opens the document and the forms module that reads the AcroForm). This has big advantages over iText 5 and gives user code more room for flexibility.
The inability to call $Form.GetField("FirstName").GetValue()
is a PowerShell limitation, by the way, and has nothing to do with iText; that kind of chaining works in C# and Java.
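If the extra lines are the main annoyance, one option is to wrap the same steps in a small function so that each script only needs a single call. The following is a minimal sketch, not an official iText recipe. It assumes the iText 7 assemblies (itext.kernel, itext.forms and their dependencies) have already been loaded into the session, for example with Add-Type -Path. The function name Get-PdfFieldValue and its parameters are hypothetical, and it uses GetValueAsString() rather than GetValue() so that the result is a plain string.

```powershell
# Hypothetical helper, not part of iText. Assumes the iText 7 assemblies
# are already loaded into the current PowerShell session.
function Get-PdfFieldValue {
    param(
        [Parameter(Mandatory)] [string] $Path,
        [Parameter(Mandatory)] [string] $FieldName
    )

    $reader = [iText.Kernel.Pdf.PdfReader]::new($Path)
    $pdfDoc = [iText.Kernel.Pdf.PdfDocument]::new($reader)
    try {
        # $false: do not create an AcroForm if the PDF has none
        $form = [iText.Forms.PdfAcroForm]::GetAcroForm($pdfDoc, $false)
        if ($null -eq $form) { return $null }

        # GetField returns $null when no field has that name
        $field = $form.GetField($FieldName)
        if ($null -eq $field) { return $null }

        # Return the field value as a plain string
        return $field.GetValueAsString()
    }
    finally {
        # Closing the document also closes the underlying reader
        $pdfDoc.Close()
    }
}

# Usage, with the path and field name from the question
$First = Get-PdfFieldValue -Path "C:\temp\TestForm.pdf" -FieldName "FirstName"
```

Inside the existing foreach loop, the call would take the cleaned-up file path in place of the hard-coded one. The try/finally block is there so the file handle is released even when a field lookup fails.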