Tags: c#, .net, pdf, itext, itext7

How to Read PieceInfo Per Page Metadata From a PDF


I am trying to use iText7 to read PieceInfo metadata from a PDF in a C# .NET application. You can see the metadata from the PDF file itself below (metadata names have been changed for confidentiality purposes, but the structure remains the same). Each page contains this metadata in a PieceInfo object. I know that it is possible to read this data because it is viewable in PDFGears as shown here. If this is not possible in iText7, is there another framework that would let me read it?

/PieceInfo
<<
/MDPS:MetaDataDefaultSet
<<
/LastModified(D:20210108183543Z)
/Private
<<
/ID(2A)
/Name1(John Doe)
/Code(9889023470)
/BagBundleG(P00024 0003** -R00064)
/ID(3A)
/Name2(Jane Doe)
/Code(21344143)
/BagBundleO(P0002L 0000** -R00037)
>>
>>
/MDPS:ControlSet
<<
/LastModified(D:20210108183543Z)
/Private
<<
/1
<<
/Dat1_Rel
<<
/Type(Rel)
/Value(01005A2)
/Enable(False)
>>
>>
/2
<<
/Dat2_Rel
<<
/Type(Rel)
/Value(02005A2)
/Enable(False)
>>
>>
/3
<<
/Dat3_Rel
<<
/Type(Rel)
/Value(03005A2)
/Enable(False)
>>
>>
/4
<<
/Dat4_Rel
<<
/Type(Rel)
/Value(04005A2)
/Enable(False)
>>
>>
/5
<<
/Dat5_Rel
<<
/Type(Rel)
/Value(05005A2)
/Enable(False)

I have tried the following but pagedict is always null.

PdfReader reader = new PdfReader("PMI1040_01A_B_R001.PDF");
PdfWriter writer = new PdfWriter("PMI1040_01A_B_R002.PDF");
PdfDocument pdfDoc = new PdfDocument(reader, writer);

PdfDictionary pagedict;
for (int i = 1; i < pdfDoc.GetNumberOfPages(); i++)
{
    pagedict = pdfDoc.GetPage(i).GetPdfObject().GetAsDictionary(new 
    PdfName("PieceInfo"));
    MessageBox.Show((pagedict.Get(new PdfName("PieceInfo")).ToString()));
}
reader.Close();

Any help would be appreciated.


Solution

  • Your variable pagedict is not the page dictionary; it already holds the PieceInfo dictionary:

    pagedict = pdfDoc.GetPage(i).GetPdfObject().GetAsDictionary(new PdfName("PieceInfo"));
    

    Thus, the subsequent call pagedict.Get(new PdfName("PieceInfo")) looks for a PieceInfo entry inside the PieceInfo dictionary. No such entry exists, so Get returns null, and calling ToString() on that null result fails.

    So simply drop one of the two Get*(new PdfName("PieceInfo")) calls to fix your issue.
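
For illustration, here is a minimal sketch of the loop with the duplicate lookup removed. It assumes the iText 7 .NET package (namespace iText.Kernel.Pdf) and reuses the input file name from the question. The class name PieceInfoDump, the console output, and the null check are my own additions, not part of the original code. It also opens the file without a PdfWriter, since nothing is modified, and runs the loop with <= because iText page numbers start at 1, so the i < GetNumberOfPages() condition in the question would skip the last page.

```csharp
using System;
using iText.Kernel.Pdf;

// Hypothetical helper class, named here only for illustration.
class PieceInfoDump
{
    static void Main(string[] args)
    {
        // Read-only access: a PdfReader alone is enough to inspect the file.
        PdfDocument pdfDoc = new PdfDocument(new PdfReader("PMI1040_01A_B_R001.PDF"));
        try
        {
            PdfName pieceInfoName = new PdfName("PieceInfo");

            // Page numbers are 1-based, so <= is needed to reach the last page.
            for (int i = 1; i <= pdfDoc.GetNumberOfPages(); i++)
            {
                // Look up PieceInfo exactly once, directly on the page dictionary.
                PdfDictionary pieceInfo =
                    pdfDoc.GetPage(i).GetPdfObject().GetAsDictionary(pieceInfoName);

                if (pieceInfo == null)
                {
                    Console.WriteLine("Page " + i + ": no PieceInfo entry");
                    continue;
                }

                // Each key is an application data set, e.g. MDPS:ControlSet.
                foreach (PdfName key in pieceInfo.KeySet())
                {
                    Console.WriteLine("Page " + i + ": " + key + " = " + pieceInfo.Get(key));
                }
            }
        }
        finally
        {
            // Closing the document also closes the underlying reader.
            pdfDoc.Close();
        }
    }
}
```

To drill further into the structure shown in the question, the same GetAsDictionary call can be applied to the nested entries (for example the Private dictionary inside each data set), and GetAsString can be used for the string values.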