I need to read some metadata from a PDF file. I have code based on the itextpdf library that does the job:
    static String getPdfFormVersion() throws IOException {
        InputStream inputStream = PDFVersionExtractor.class.getResourceAsStream("/test.pdf");
        final PdfReader pdfReader = new PdfReader(inputStream);
        final byte[] docMetaData = pdfReader.getMetadata();
        try (final ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream()) {
            byteArrayOutputStream.write(docMetaData);
            final String fileXML = byteArrayOutputStream.toString(StandardCharsets.UTF_8.name());
            final String versionNode = fileXML.substring(fileXML.indexOf("<desc:version"), fileXML.indexOf("</desc:version>"));
            return versionNode;
        }
    }
The library I am using is:
    <dependency>
        <groupId>com.itextpdf</groupId>
        <artifactId>itextpdf</artifactId>
        <version>5.5.13.3</version>
    </dependency>
I would like to migrate to the newest version of itext:
    <dependency>
        <groupId>com.itextpdf</groupId>
        <artifactId>itext-core</artifactId>
        <version>8.0.3</version>
        <type>pom</type>
    </dependency>
    <dependency>
        <groupId>com.itextpdf</groupId>
        <artifactId>kernel</artifactId>
        <version>8.0.3</version>
        <type>pom</type>
    </dependency>
Unfortunately, the getMetadata() method is no longer available. I tried to find an equivalent in the PdfReader class, but without success. How can I extract PDF metadata using the newest version of itextpdf?
Thanks to @K J's comments, I came up with a solution that seems to work the same way as the previous one:
    import com.itextpdf.kernel.pdf.*;
    import java.io.*;
    import java.nio.charset.StandardCharsets;

    public class PDFVersionExtractor {
        public static void main(String[] args) throws IOException {
            String pdfFormVersion = getPdfFormVersion();
            System.out.println(pdfFormVersion);
        }

        static String getPdfFormVersion() throws IOException {
            InputStream inputStream = PDFVersionExtractor.class.getResourceAsStream("/test.pdf");
            // create pdf document representation from the reader;
            // try-with-resources closes the document, which also closes the reader
            try (PdfDocument pdfDoc = new PdfDocument(new PdfReader(inputStream))) {
                // read xmpMetadata from the document
                final byte[] docMetaData = pdfDoc.getXmpMetadata();
                // decode the raw XMP bytes directly; no intermediate stream is needed
                final String fileXML = new String(docMetaData, StandardCharsets.UTF_8);
                return fileXML.substring(fileXML.indexOf("<desc:version"), fileXML.indexOf("</desc:version>"));
            }
        }
    }
What I did was use the PdfDocument class to call its getXmpMetadata method. The returned byte[] seems to contain the same information as the one returned by PdfReader.getMetadata in the old version.
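Since the substring approach throws if the tag is missing, here is a rough sketch of reading the value with the JDK's XML parser instead. It assumes the byte[] from getXmpMetadata is a well-formed XMP packet and that the version is stored as the text of a desc:version element, as the substring code suggests. The class name XmpElementReader and the method readElementText are made up for illustration and are not part of iText. Note that it returns only the element text, not the opening tag that the substring version includes.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.util.Optional;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

// Hypothetical helper (not an iText class): pulls one element's text out of raw XMP bytes.
class XmpElementReader {

    // Returns the trimmed text of the first element whose qualified name matches,
    // or an empty Optional if there is no metadata or no such element.
    static Optional<String> readElementText(byte[] xmp, String qualifiedName)
            throws ParserConfigurationException, SAXException, IOException {
        if (xmp == null || xmp.length == 0) {
            return Optional.empty();
        }
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // XMP packets do not need a DOCTYPE, so reject one if it shows up.
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        // The factory is not namespace-aware by default, so elements are matched
        // by their literal qualified name, e.g. "desc:version".
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document document = builder.parse(new ByteArrayInputStream(xmp));
        NodeList matches = document.getElementsByTagName(qualifiedName);
        if (matches.getLength() == 0) {
            return Optional.empty();
        }
        return Optional.of(matches.item(0).getTextContent().trim());
    }
}
```

In getPdfFormVersion this could be called as XmpElementReader.readElementText(docMetaData, "desc:version"), with the caller deciding what to do when the Optional is empty. Matching on the literal prefix is fragile if another tool writes the same namespace under a different prefix, so a namespace-aware lookup may be the better choice if you know the namespace URI.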