Tags: java, maven, pdf, pdfbox

PDFBox: why can't I load a document?


I'm having trouble loading a PDDocument with PDDocument.load(File). I tried uninstalling and reinstalling the PDFBox jars to see whether the PDDocument class would resolve properly, but it still doesn't work and I'm not sure why. The PDDocument class seems to expose only its instance methods, not its documented static methods. The only error I get is a NoSuchMethodError at runtime for PDDocument.load(file).

Here is my code:

package main;

import java.io.File;
import java.io.IOException;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.text.PDFTextStripper;

public class Main {
    public static void main(String[] args) throws IOException {
        File f = new File("C:/Users/user/Desktop/sample/Resume_Michael_Sinclair.pdf");
        PDDocument doc = PDDocument.load(f);
        
        PDFTextStripper tp = new PDFTextStripper();
        System.out.println(tp.getText(doc));
        doc.close();
    }
}

As I understand it, pom.xml is what pulls in the PDFBox libraries, so here is my pom.xml in case it helps:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>Pdfeasy_Editor</groupId>
  <artifactId>Pdfeasy_Editor</artifactId>
  <version>0.0.1-SNAPSHOT</version>
  
  <build>
      <sourceDirectory>src</sourceDirectory>
      <plugins>
         <plugin>
            <artifactId>maven-compiler-plugin</artifactId>
            <version>3.3</version>
            <configuration>
               <source>1.8</source>
               <target>1.8</target>
            </configuration> 
         </plugin>
      </plugins> 
   </build> 
   
   <dependencies>  
      <dependency> 
         <groupId>org.apache.pdfbox</groupId> 
         <artifactId>pdfbox</artifactId> 
         <version>2.0.1</version> 
      </dependency>   
   
      <dependency> 
         <groupId>org.apache.pdfbox</groupId> 
         <artifactId>fontbox</artifactId> 
         <version>2.0.0</version> 
      </dependency>
      
      <dependency>  
         <groupId>org.apache.pdfbox</groupId> 
         <artifactId>jempbox</artifactId> 
         <version>1.8.11</version> 
      </dependency> 
        
      <dependency>
         <groupId>org.apache.pdfbox</groupId> 
         <artifactId>xmpbox</artifactId> 
         <version>2.0.0</version> 
      </dependency> 
     
      <dependency> 
         <groupId>org.apache.pdfbox</groupId> 
         <artifactId>preflight</artifactId> 
         <version>2.0.0</version> 
      </dependency> 
     
      <dependency> 
         <groupId>org.apache.pdfbox</groupId> 
         <artifactId>pdfbox-tools</artifactId> 
         <version>2.0.0</version> 
      </dependency>

   </dependencies>
   
</project>

Any help would be greatly appreciated :).


Solution

  • The problem was that my pom.xml declared the wrong versions of the PDFBox libraries. PDFBox 3.0.0 didn't have full functionality for me, so I switched to the official release, 2.0.23, and set every PDFBox dependency to that same version, for example:

    <dependency>
        <groupId>org.apache.pdfbox</groupId>
        <artifactId>pdfbox-tools</artifactId>
        <version>2.0.23</version>
    </dependency>
    

    This solved the problem, and the code I originally posted now works! Credit to @andrewjames and @mkl.
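
A NoSuchMethodError like the one in the question usually means the class found at runtime comes from a different jar than the one the code was compiled against. For reference, PDFBox 3.x no longer has the static PDDocument.load methods and uses Loader.loadPDF instead. The sketch below is one way to check which jar a class was loaded from and whether it has a given static method. It uses only standard reflection; the class name ClasspathCheck and its two helper methods are hypothetical names made up for this illustration, not part of PDFBox.

```java
import java.io.File;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.security.CodeSource;

// Hypothetical diagnostic helper (not part of PDFBox): reports where a class
// comes from at runtime and whether it exposes a given public static method.
public class ClasspathCheck {

    /** True if the class can be loaded and has a public static method with this signature. */
    public static boolean hasStaticMethod(String className, String methodName, Class<?>... paramTypes) {
        try {
            Class<?> cls = Class.forName(className);
            Method m = cls.getMethod(methodName, paramTypes);
            return Modifier.isStatic(m.getModifiers());
        } catch (ClassNotFoundException | NoSuchMethodException | LinkageError e) {
            return false;
        }
    }

    /** Jar or directory the class was loaded from, or a placeholder if that is unknown. */
    public static String locationOf(String className) {
        try {
            CodeSource src = Class.forName(className).getProtectionDomain().getCodeSource();
            // JDK classes loaded by the bootstrap loader have no code source.
            return src == null ? "(bootstrap / JDK)" : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not on classpath)";
        }
    }

    public static void main(String[] args) {
        String cls = "org.apache.pdfbox.pdmodel.PDDocument";
        System.out.println("loaded from: " + locationOf(cls));
        System.out.println("has static load(File): " + hasStaticMethod(cls, "load", File.class));
    }
}
```

If the reported jar is not the version declared in pom.xml, running mvn dependency:tree should show which dependency brings in the conflicting one.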