Tags: python, libreoffice, soffice

Converting to pdf using soffice adds blank page


I'm trying to convert an .ods file to PDF using soffice and Python:

import os
import subprocess

def ods_to_pdf(ods_filename):
    file = os.path.join(os.getcwd(), ods_filename)
    path_to_soffice = "path/to/soffice"
    subprocess.run([path_to_soffice, "--headless", "--convert-to", "pdf", file], check=True)

It works fine, but the resulting pdf has a blank page (sometimes two) at the end. Does anyone know how I can prevent this behaviour? The code runs in a Docker container with Ubuntu 18.04 as base image. LibreOffice version: 7.1.0 (I've also tried 6.1.6.3, same result).


Solution

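  • I could not find a way to stop LibreOffice from adding the blank pages, but I worked around the problem by removing them after the conversion. Note that this drops every page without extractable text, so a page that contains only images would be removed as well: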

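    import PyPDF2

    # PyPDF2 >= 3.0 names; older releases call these PdfFileReader,
    # PdfFileWriter, getPage(), extractText() and addPage().
    writer = PyPDF2.PdfWriter()

    # Keep the source file open until the output has been written,
    # because PyPDF2 reads page content lazily.
    with open("file.pdf", "rb") as source:
        reader = PyPDF2.PdfReader(source)

        for page in reader.pages:
            # Keep only pages that contain extractable (non-whitespace) text.
            if page.extract_text().strip():
                writer.add_page(page)

        with open("output.pdf", "wb") as output_stream:
            writer.write(output_stream)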
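
The two steps (convert with soffice, then drop the pages without text) could also be combined into one function. The following is only a sketch under a few assumptions: it uses the PyPDF2 3.x names (`PdfReader`, `PdfWriter`), it assumes soffice writes `<basename>.pdf` into the directory given with `--outdir`, and the helper names `is_blank`, `pages_to_keep` and `ods_to_clean_pdf` are made up for illustration. `path/to/soffice` is the same placeholder as in the question.

```python
import os
import subprocess


def is_blank(page_text):
    """Treat a page as blank when it has no text besides whitespace."""
    return not (page_text or "").strip()


def pages_to_keep(page_texts):
    """Return the indices of the pages that contain extractable text."""
    return [i for i, text in enumerate(page_texts) if not is_blank(text)]


def ods_to_clean_pdf(ods_filename, path_to_soffice="path/to/soffice"):
    """Convert with soffice, then rewrite the PDF without its blank pages."""
    # Imported here so the helpers above can be used without PyPDF2 installed.
    import PyPDF2

    source = os.path.abspath(ods_filename)
    out_dir = os.path.dirname(source)
    subprocess.run(
        [path_to_soffice, "--headless", "--convert-to", "pdf",
         "--outdir", out_dir, source],
        check=True,
    )
    # Assumption: the output keeps the input's base name with a .pdf extension.
    pdf_path = os.path.splitext(source)[0] + ".pdf"
    tmp_path = pdf_path + ".tmp"

    writer = PyPDF2.PdfWriter()
    with open(pdf_path, "rb") as pdf_file:
        reader = PyPDF2.PdfReader(pdf_file)
        texts = [page.extract_text() for page in reader.pages]
        for index in pages_to_keep(texts):
            writer.add_page(reader.pages[index])
        # Write to a temporary file while the source PDF is still open.
        with open(tmp_path, "wb") as tmp_file:
            writer.write(tmp_file)

    # Replace the original PDF only after both files are closed.
    os.replace(tmp_path, pdf_path)
    return pdf_path
```

Keeping the "is this page blank?" decision in a small function of its own makes it easy to tighten later, for example if a sheet legitimately contains pages with only images, which this text-based check would drop.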