Based on this SO question, I tried reading through every single page of a PDF file. The background is that I am trying to replace pages that contain images but no text content with completely blank pages, because the PDF can contain "blank" pages that still carry images. These pages have to stay in the document because it is going to be printed duplex.
But with PDFBox 2.0 this seems to be a bit more complicated, since I run into a stack trace every time I try to save the freshly generated PDDocument. Should this be done differently with the new version, PDFBox 2.0? Should I avoid closing the PDDocument buffer? If I leave that call out, the sample program runs without an exception, but what could be the potential side effects of doing so?
A simple running example can be seen here. You can use any PDF file, since the result will be a PDF file with the same number of pages, all of which should be empty:
public static void main(String[] args) throws IOException {
    // Load a simple pdf file
    PDDocument d = PDDocument.load(new File("D:\\test.pdf"));
    // This should be our new output pdf
    PDDocument c = new PDDocument();
    for (int i = 0; i < d.getNumberOfPages(); ++i) {
        // From the SO question, create a new PDDocument and just add the single page
        PDDocument buffer = new PDDocument();
        PDPage page = d.getPage(i);
        buffer.addPage(page);
        // Here I'd check if it has content but gonna leave it out now
        // Reassign the page variable to generate a "blank" pdf
        page = new PDPage();
        // In order to let some printers not ignore the blank page I have to
        // write white text on the white background.
        PDPageContentStream contentStream = new PDPageContentStream(buffer, page);
        PDFont font = PDType1Font.HELVETICA_BOLD;
        contentStream.beginText();
        contentStream.setNonStrokingColor(Color.white); // !!!!!!
        contentStream.setFont(font, 6);
        contentStream.newLineAtOffset(100, 700);
        contentStream.showText("Empty page");
        contentStream.endText();
        contentStream.close();
        // Close the buffer document, if I comment it out the exception is gone
        buffer.close();
        // Add the blank page
        c.addPage(page);
    }
    d.close();
    // The exception occurs here and seems to be connected with the closing of the buffer document
    c.save("D:\\newtest.pdf");
    c.close();
}
The stack trace:
Exception in thread "main" java.io.IOException: Scratch file already closed
at org.apache.pdfbox.io.ScratchFile.checkClosed(ScratchFile.java:390)
at org.apache.pdfbox.io.ScratchFileBuffer.checkClosed(ScratchFileBuffer.java:99)
at org.apache.pdfbox.io.ScratchFileBuffer.seek(ScratchFileBuffer.java:295)
at org.apache.pdfbox.io.RandomAccessInputStream.restorePosition(RandomAccessInputStream.java:47)
at org.apache.pdfbox.io.RandomAccessInputStream.read(RandomAccessInputStream.java:78)
at java.io.InputStream.read(InputStream.java:101)
at org.apache.pdfbox.io.IOUtils.copy(IOUtils.java:66)
at org.apache.pdfbox.pdfwriter.COSWriter.visitFromStream(COSWriter.java:1134)
at org.apache.pdfbox.cos.COSStream.accept(COSStream.java:372)
at org.apache.pdfbox.pdfwriter.COSWriter.doWriteObject(COSWriter.java:533)
at org.apache.pdfbox.pdfwriter.COSWriter.doWriteBody(COSWriter.java:450)
at org.apache.pdfbox.pdfwriter.COSWriter.visitFromDocument(COSWriter.java:1034)
at org.apache.pdfbox.cos.COSDocument.accept(COSDocument.java:409)
at org.apache.pdfbox.pdfwriter.COSWriter.write(COSWriter.java:1284)
at org.apache.pdfbox.pdfwriter.COSWriter.write(COSWriter.java:1185)
at org.apache.pdfbox.pdmodel.PDDocument.save(PDDocument.java:1110)
at org.apache.pdfbox.pdmodel.PDDocument.save(PDDocument.java:1082)
at org.apache.pdfbox.pdmodel.PDDocument.save(PDDocument.java:1070)
at pdftools.Test.main(Test.java:41)
Your code is somewhat confusing, but the core of the problem is that in 2.0 you should not close documents if you are using their pages in another document.
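To make that rule concrete, here is a minimal sketch of one way to postpone the close calls until after the target document has been saved. The DeferredCloser class and its register and pendingCount methods are hypothetical names made up for this illustration and are not part of PDFBox; the sketch only relies on PDDocument implementing java.io.Closeable.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.Deque;

/**
 * Hypothetical helper (not part of PDFBox): remembers resources and
 * closes all of them later, in reverse order of registration.
 */
final class DeferredCloser implements Closeable {
    private final Deque<Closeable> pending = new ArrayDeque<>();

    /** Registers a resource and returns it, so the call can wrap a constructor. */
    <T extends Closeable> T register(T resource) {
        pending.push(resource);
        return resource;
    }

    /** Number of resources still waiting to be closed. */
    int pendingCount() {
        return pending.size();
    }

    /** Closes everything that was registered, most recently registered first. */
    @Override
    public void close() throws IOException {
        IOException first = null;
        while (!pending.isEmpty()) {
            try {
                pending.pop().close();
            } catch (IOException e) {
                // Keep closing the rest; report the first failure at the end.
                if (first == null) {
                    first = e;
                } else {
                    first.addSuppressed(e);
                }
            }
        }
        if (first != null) {
            throw first;
        }
    }
}
```

Applied to the code in the question, this would presumably mean writing PDDocument buffer = closer.register(new PDDocument()); inside the loop, dropping buffer.close() there, and calling closer.close() (or leaving a try-with-resources block around the loop) only after c.save(...) has returned. The design choice is simply to tie the lifetime of every source document to the lifetime of the document that borrows its pages.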
So here are some solutions: