java, multithreading, pdf, import, pdfbox

Reading multiple PDF files in order


I split a PDF file into multiple PDF files, and then I try to read those files from the folder and print out their names.

int l=1;
File file = new File(userInputFile);
try (PDDocument document = PDDocument.load(file)) {

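    // Split the source document into one PDDocument per page
    Splitter splitter = new Splitter();
    List<PDDocument> pages = splitter.split(document);
    Iterator<PDDocument> iterator = pages.listIterator();

    while (iterator.hasNext()) {
        PDDocument pd = iterator.next();
        pd.save("C:\\Users\\Public\\Documents\\FolderForCheckListTest_000\\"+"Page "+l++);
        pd.close(); // release each page document once it has been saved
    }
    // no explicit document.close() needed: try-with-resources closes it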
}

Files in folder: Page 1, Page 2, Page 3, Page 4, Page 5, Page 6, Page 7, Page 8, Page 9 and Page 10.

When I read these files and print their names, I get them in the wrong order: Page 1, Page 10, Page 2 and so on.

Here is my code for reading files:

for (File ListOfFile : ListOfFiles) {
    if (ListOfFile.isFile()) {
        files  = ListOfFile.getName();
        if (files.startsWith("Page")){
            000\\multiplePDFtest\\";
            String nfiles = path;
            PDFManager pdfManager = new PDFManager();
            String pdfToText = pdfManager.pdftoText(nfiles+files);
            listStrings.add(pdfToText);
        }
    }
}

Do you know how to fix it? Thank you in advance :)


Solution

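  • The order you see is plain string order: names are compared character by character, so "Page 10" comes before "Page 2". To give the file names a fixed length, so that string order matches page order, change this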

    pd.save("C:\\Users\\Public\\Documents\\FolderForCheckListTest_000\\"+"Page "+l++);
    

    to this

    pd.save("C:\\Users\\Public\\Documents\\FolderForCheckListTest_000\\"+"Page "+String.format("%02d",l++));
    

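    (For clarity, I recommend moving the "++" out of the expression, but that's another story. Note that "%02d" pads to two digits, so this keeps the order correct only up to 99 pages.)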
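
To make the effect of the padding visible, here is a minimal sketch that sorts unpadded and zero-padded names as plain strings. It also shows an alternative for files that already exist and cannot be renamed: sorting by the number parsed from the name. The class and method names (PageOrderSketch, paddedName, pageNumber, sortByPageNumber) are made up for this illustration, and the parsing assumes names of exactly the form "Page <n>" with no extension, as in the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class PageOrderSketch {

    // Builds a zero-padded name such as "Page 03" (two digits, as in the answer).
    static String paddedName(int pageNumber) {
        return "Page " + String.format("%02d", pageNumber);
    }

    // Extracts the number from a name such as "Page 10".
    // Assumption: the name always follows the "Page <n>" pattern.
    static int pageNumber(String name) {
        return Integer.parseInt(name.substring("Page ".length()).trim());
    }

    // Returns a copy of the names sorted by page number instead of by string order.
    static List<String> sortByPageNumber(List<String> names) {
        List<String> sorted = new ArrayList<>(names);
        sorted.sort(Comparator.comparingInt(PageOrderSketch::pageNumber));
        return sorted;
    }

    public static void main(String[] args) {
        // Unpadded names: string order puts "Page 10" before "Page 2".
        List<String> unpadded = new ArrayList<>(Arrays.asList("Page 2", "Page 10", "Page 1"));
        unpadded.sort(Comparator.naturalOrder());
        System.out.println(unpadded); // [Page 1, Page 10, Page 2]

        // Zero-padded names: string order now matches page order.
        List<String> padded = new ArrayList<>(
                Arrays.asList(paddedName(10), paddedName(2), paddedName(1)));
        padded.sort(Comparator.naturalOrder());
        System.out.println(padded); // [Page 01, Page 02, Page 10]

        // Alternative for existing unpadded names: compare the parsed numbers.
        System.out.println(sortByPageNumber(unpadded)); // [Page 1, Page 2, Page 10]
    }
}
```

Since File.listFiles() does not guarantee any particular order, it is probably worth sorting the array explicitly before the loop in either case, for example with Arrays.sort(ListOfFiles, Comparator.comparing(File::getName)) once the names are padded.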