
PDFBox v3.0.0 StackOverflowError when splitting PDF file

I have a Java class Split that is responsible for splitting PDF files into multiple parts based on page ranges. The class uses PDFBox for this purpose. Additionally, I have a PDFModel class to manage the resulting PDF files and a Range class to specify page ranges.

Here's the Split class:

import java.io.File;
import java.io.IOException;
import java.nio.file.Paths;
import java.util.ArrayList;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPageTree;

public class Split {
    private Logger logger;
    private File inputFile;
    private PDFModel pdfModel;
    private File outputDirectory;

    public Split(Logger logger, File inputFile, File outputDirectory) {
        // Constructor logic...
    }

    /**
     * Splits a PDF file based on a list of page ranges and saves the resulting partial PDFs.
     * @param ranges A list of page ranges specifying which pages to split from the input PDF.
     * @return An ArrayList of PDFModel objects representing the resulting partial PDFs.
     */
    public ArrayList<PDFModel> splitByRanges(ArrayList<Range> ranges) {
        ArrayList<PDFModel> results = new ArrayList<>();
        for (int i = 0; i < ranges.size(); i++) {
            PDDocument partial = split(ranges.get(i));
            if (partial == null) {
                continue; // assumed: the body of this branch is missing from the post
            }
            File outputFile = new File(Paths.get(outputDirectory.getAbsolutePath(), "file_" + i + ".pdf").toString());
            try {
                results.add(new PDFModel(outputFile, partial));
                // The logger method name is missing from the post; info(...) is assumed.
                logger.info(this, "Successfully split '" + inputFile + "' from page " + ranges.get(i).getFrom() + " to " + ranges.get(i).getTo() + " into '" + outputFile.getAbsolutePath() + "'");
            } catch (IOException e) {
                // Handler body is missing from the post.
            }
        }
        return results;
    }

    private PDDocument split(Range range) {
        PDDocument result = new PDDocument();

        int fromPage = range.getFrom();
        int toPage = range.getTo();
        // Get the PDPageTree from the PDDocument
        PDPageTree pdPageTree = pdfModel.getPDDocument().getPages();
        if (fromPage <= 0 || toPage <= 0 || fromPage > toPage || toPage > pdPageTree.getCount()) {
            logger.warning(this, "Invalid page range for splitting.");
            return null;
        }
        // Page numbers are 1-based, page tree indices are 0-based.
        for (int i = fromPage - 1; i < toPage; i++) {
            result.addPage(pdPageTree.get(i)); // assumed: the loop body is missing from the post
        }
        return result;
    }
}
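
The range check and the 1-based to 0-based conversion used by split can be tried out without PDFBox. The following is only a minimal sketch of that logic; PageRangeCheck, isValid and zeroBasedIndices are hypothetical names that do not appear in my real code, and pageCount stands in for pdPageTree.getCount().

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper mirroring the range check in split(); not part of PDFBox.
final class PageRangeCheck {

    // True when the 1-based, inclusive range from..to fits a document with pageCount pages.
    static boolean isValid(int from, int to, int pageCount) {
        return from > 0 && to > 0 && from <= to && to <= pageCount;
    }

    // Converts a valid 1-based inclusive range into the 0-based indices the copy loop visits.
    static List<Integer> zeroBasedIndices(int from, int to, int pageCount) {
        if (!isValid(from, to, pageCount)) {
            throw new IllegalArgumentException("Invalid page range: " + from + ".." + to);
        }
        List<Integer> indices = new ArrayList<>();
        for (int i = from - 1; i < to; i++) {
            indices.add(i);
        }
        return indices;
    }
}
```

For a 16-page document (an assumed size), zeroBasedIndices(8, 9, 16) yields the indices 7 and 8, while ranges such as (0, 3), (5, 4) or (10, 17) are rejected.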


A variant based on org.apache.pdfbox.multipdf.Splitter does the same job but doesn't work either:

private PDDocument split(Range range) {
    int fromPage = range.getFrom();
    int toPage = range.getTo();

    PDDocument pddocument = pdfModel.getPDDocument();

    Splitter splitter = new Splitter();
    splitter.setSplitAtPage(toPage - fromPage + 1);

    List<PDDocument> lst = null;
    try {
        lst = splitter.split(pddocument);
    } catch (IOException e) {
        // Handler body is missing from the post.
    }

    return lst.get(0);
}
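
A side note on this variant: as far as I understand setSplitAtPage(n), it cuts the document into consecutive chunks of n pages counted from the first page, so lst.get(0) would hold pages 1..n rather than fromPage..toPage. Splitter also offers setStartPage and setEndPage to restrict the window that gets split. The sketch below models only the page numbering with plain lists, under that assumption; ChunkModel and chunks are hypothetical names, not PDFBox API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of fixed-size page chunking; it only tracks page numbers.
final class ChunkModel {

    // Cuts the 1-based pages startPage..endPage into consecutive chunks of `size` pages.
    static List<List<Integer>> chunks(int startPage, int endPage, int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        List<List<Integer>> result = new ArrayList<>();
        List<Integer> current = new ArrayList<>();
        for (int page = startPage; page <= endPage; page++) {
            current.add(page);
            if (current.size() == size) {
                result.add(current);
                current = new ArrayList<>();
            }
        }
        if (!current.isEmpty()) {
            result.add(current); // trailing chunk that is shorter than `size`
        }
        return result;
    }
}
```

For Range(8, 9) on an assumed 16-page document, chunks(1, 16, 2).get(0) is pages 1 and 2, not 8 and 9. Restricting the window first, as in chunks(8, 9, 2), gives a single chunk with pages 8 and 9.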

The PDFModel class:

public class PDFModel {
    private File file;
    private PDDocument pdDocument;
    private ArrayList<PDFImage> images;
    private ArrayList<String> pages;

    public PDFModel(File file, PDDocument pdDocument) {
        // Constructor logic...
    }
}

The Range class:

public class Range {
    private int from;
    private int to;

    public Range(int from, int to) {
        // Constructor logic...
    }
}

I'm trying to use this Split class to split a PDF file into multiple parts. The following call throws an error:

Split splitter = new Split(logger, inputFile, outputDirectory);
splitter.splitByRanges(new ArrayList<Range>(Arrays.asList(new Range(1, 7), new Range(8, 9), new Range(10, 11))));

And this call works perfectly fine (though not with org.apache.pdfbox.multipdf.Splitter):

Split splitter = new Split(logger, inputFile, outputDirectory);
splitter.splitByRanges(new ArrayList<Range>(Arrays.asList(new Range(1, 8), new Range(10, 12), new Range(14, 16))));

The failing call produces the following StackOverflowError:

Exception in thread "main" java.lang.StackOverflowError
    at java.base/java.util.HashMap.tableSizeFor(
    at java.base/java.util.HashMap.<init>(
    at java.base/java.util.LinkedHashMap.<init>(
    at java.base/java.util.HashSet.<init>(
    at java.base/java.util.LinkedHashSet.<init>(
    at org.apache.pdfbox.util.SmallMap.entrySet(
    at org.apache.pdfbox.cos.COSDictionary.entrySet(
    at org.apache.pdfbox.pdfwriter.compress.COSWriterObjectStream.writeCOSDictionary(
    at org.apache.pdfbox.pdfwriter.compress.COSWriterObjectStream.writeObject(
    at org.apache.pdfbox.pdfwriter.compress.COSWriterObjectStream.writeCOSDictionary(
    at org.apache.pdfbox.pdfwriter.compress.COSWriterObjectStream.writeObject(
    at org.apache.pdfbox.pdfwriter.compress.COSWriterObjectStream.writeCOSArray(
    at org.apache.pdfbox.pdfwriter.compress.COSWriterObjectStream.writeObject(
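
The alternating writeObject / writeCOSDictionary / writeCOSArray frames suggest that the writer walks nested dictionaries and arrays recursively. Whether a cyclic reference (for example a page pointing back to its parent) is what actually happens inside PDFBox here is only my assumption. The sketch below shows, with plain maps and lists, how such a recursive walk overflows the stack on a cyclic graph and how a "seen" set avoids it. GraphWalk and all of its methods are hypothetical and unrelated to the PDFBox classes in the trace.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a nested object graph; this is not PDFBox code.
final class GraphWalk {

    // Builds pages -> kids -> page -> parent -> pages, i.e. a graph with a cycle.
    static Map<String, Object> cyclicGraph() {
        Map<String, Object> pages = new LinkedHashMap<>();
        Map<String, Object> page = new LinkedHashMap<>();
        List<Object> kids = new ArrayList<>();
        kids.add(page);
        pages.put("Kids", kids);
        page.put("Parent", pages);
        return pages;
    }

    // Naive walk: recurses into every nested map or list without a cycle check.
    static int countNaive(Object node) {
        int count = 1;
        if (node instanceof Map) {
            for (Object value : ((Map<?, ?>) node).values()) {
                count += countNaive(value);
            }
        } else if (node instanceof List) {
            for (Object item : (List<?>) node) {
                count += countNaive(item);
            }
        }
        return count;
    }

    // Guarded walk: an identity-based "seen" set stops the recursion at revisited nodes.
    static int countGuarded(Object node, Set<Object> seen) {
        if (!seen.add(node)) {
            return 0; // already visited, do not descend again
        }
        int count = 1;
        if (node instanceof Map) {
            for (Object value : ((Map<?, ?>) node).values()) {
                count += countGuarded(value, seen);
            }
        } else if (node instanceof List) {
            for (Object item : (List<?>) node) {
                count += countGuarded(item, seen);
            }
        }
        return count;
    }

    static int countGuarded(Object root) {
        return countGuarded(root, Collections.newSetFromMap(new IdentityHashMap<Object, Boolean>()));
    }

    // Reports whether the naive walk ends in a StackOverflowError for the given root.
    static boolean naiveWalkOverflows(Object root) {
        try {
            countNaive(root);
            return false;
        } catch (StackOverflowError e) {
            return true;
        }
    }
}
```

On the cyclic graph the guarded walk visits each of the three containers once, while the naive walk never terminates on its own and ends in a StackOverflowError.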


How can I resolve this StackOverflowError?


  • Solution 1

    The problem seems to be on PDFBox's side, so here is just a workaround for version 3.0.0:

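    private PDDocument split(Range range) {
        PDDocument pdDocument = new PDDocument();
        // Copy every page of the source document into the new document first.
        for (PDPage pdPage : pdfModel.getPDDocument().getPages()) {
            pdDocument.addPage(pdPage); // assumed: the loop body is missing from the post
        }
        int fromPage = range.getFrom();
        int toPage = range.getTo();
        int pageCount = pdDocument.getNumberOfPages();
        if (fromPage <= 0 || toPage <= 0 || fromPage > toPage || toPage > pageCount) {
            logger.warning(this, "Invalid page range for splitting.");
            return null;
        }
        System.out.println("Page count: " + pdDocument.getNumberOfPages());
        // Remove the pages after the range, back to front so the indices stay valid.
        for (int n = pageCount - 1; n >= toPage; n--) {
            pdDocument.removePage(n); // assumed: the loop body is missing from the post
        }
        // Remove the pages before the range, again back to front.
        for (int n = fromPage - 2; n >= 0; n--) {
            pdDocument.removePage(n); // assumed: the loop body is missing from the post
        }
        return pdDocument;
    }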
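
The index arithmetic of the two removal loops can be checked on a plain list of page numbers. This is only a sketch of the idea (copy everything, then trim the tail and the head from back to front); KeepRange and keepRange are hypothetical names, and the list stands in for the pages of the copied document.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the workaround's loops; the list plays the role of the page tree.
final class KeepRange {

    // Keeps only the 1-based, inclusive range from..to by removing everything else.
    static List<Integer> keepRange(List<Integer> pages, int from, int to) {
        List<Integer> copy = new ArrayList<>(pages);
        // Trim the tail first; going backwards keeps the remaining indices stable.
        for (int n = copy.size() - 1; n >= to; n--) {
            copy.remove(n); // n is an int, so this removes by index
        }
        // Then trim the head, again from the highest index down to 0.
        for (int n = from - 2; n >= 0; n--) {
            copy.remove(n);
        }
        return copy;
    }
}
```

With pages 1 to 12, keepRange(pages, 8, 9) leaves pages 8 and 9. Removing in ascending order instead would shift the indices after every removal and drop the wrong pages.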

    Solution 2

    Use an org.apache.pdfbox version later than 3.0.0, i.e. 3.0.1 or above; hopefully the bug is fixed there.