Convert PDF to JPEG


I'm trying to convert a Base64-encoded PDF to a Base64-encoded JPEG image. The problem is that if the PDF was scanned at less than 600 dpi, the image is not rendered correctly: I get a blank white image with no content.

This is my code:

public String convertPDFtoJPEG(byte[] pdfData) {
    // Declared outside the try block so it is still in scope at the return.
    String base64JPEG = null;
    // try-with-resources closes the document even if rendering fails.
    try (PDDocument document = PDDocument.load(pdfData)) {
      PDFRenderer renderer = new PDFRenderer(document);

      for (int pageIndex = 0; pageIndex < document.getNumberOfPages(); pageIndex++) {
        BufferedImage image = renderer.renderImageWithDPI(pageIndex, 300);
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        ImageIO.write(image, "JPEG", baos);

        // Each page overwrites the previous one, so only the last page is returned.
        base64JPEG = Base64.getEncoder().encodeToString(baos.toByteArray());
      }
    } catch (IOException e) {
      e.printStackTrace();
    }
    return base64JPEG;
  }

 public byte[] readBase64FromFile(String filePath) {
    File file = new File(filePath);
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    try (BufferedReader reader = new BufferedReader(new FileReader(file))) {
      String line;
      while ((line = reader.readLine()) != null) {
        baos.write(line.getBytes());
      }
    } catch (IOException e) {
      e.printStackTrace();
    } 
    return Base64.getDecoder().decode(baos.toByteArray());
  }

This is how I'm calling these methods:

String filePath = "path:\\to\\file.txt";
byte[] pdfData = readBase64FromFile(filePath);

convertPDFtoJPEG(pdfData);

file.txt contains the Base64-encoded PDF.


Solution

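  • As discussed in the comments, make sure you have the dependencies for the JPX (JPEG 2000) and JBIG2 image formats on the classpath. Scanned PDFs often store their page images in these formats, and Java's ImageIO cannot decode them without extra plugins, so the page renders blank. The two dependencies below cover JPEG 2000; JBIG2 needs a separate plugin (org.apache.pdfbox:jbig2-imageio):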

    <dependency>
        <groupId>com.github.jai-imageio</groupId>
        <artifactId>jai-imageio-jpeg2000</artifactId>
        <version>1.4.0</version>
    </dependency>
    <dependency>
        <groupId>com.github.jai-imageio</groupId>
        <artifactId>jai-imageio-core</artifactId>
        <version>1.4.0</version>
    </dependency>
    

    I also recommend using the TwelveMonkeys ImageIO plugins for TIFF and JPEG.
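
    To check whether the plugins were actually picked up at runtime, one option is to ask ImageIO which readers it has registered. This is only a minimal sketch: the class name ImageReaderCheck and the helper hasReader are made up for illustration, and the format names "jpeg2000" and "jbig2" are assumptions about what the plugins register, so adjust them if your versions use different names.

```java
import java.util.Iterator;
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;

// Hypothetical helper class, not part of PDFBox or jai-imageio.
final class ImageReaderCheck {

    /** Returns true if at least one ImageIO reader is registered for the given format name. */
    static boolean hasReader(String formatName) {
        Iterator<ImageReader> readers = ImageIO.getImageReadersByFormatName(formatName);
        return readers.hasNext();
    }

    public static void main(String[] args) {
        // Re-scan the classpath so plugins added as dependencies are registered.
        ImageIO.scanForPlugins();

        // "jpeg" ships with the JDK; the other two names are assumed plugin format names.
        for (String name : new String[] {"jpeg", "jpeg2000", "jbig2"}) {
            System.out.println(name + " reader available: " + hasReader(name));
        }
    }
}
```

    If a format is reported as unavailable, the corresponding dependency is probably missing from the runtime classpath, which would be consistent with the blank pages.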