pdf command-line jpeg

Batch conversion of jpg to pdf


I have a huge number of JPEG files, each being a photo of a page from a historical document. Now I want to (batch) create PDF files out of these, preferably putting the files that represent one document into a separate PDF file each, with the pages in the correct order. Filenames are constructed like this: "date y p id optional.jpg", where y is the running number if several documents have the same date, p is the page number, id is the number of the photo from the camera, and optional is sometimes present and contains optional info on the document. All pieces are separated by a space. I was hoping to use the built-in Microsoft PDF writer, but have not found a command-line interface for it. I can of course generate a script from the directory listing, provided that I know the command-line interface of the application the script is for. A bonus would be if each page of the created PDF file could contain parts of the filename.


Solution

  • If you aren't against a Python script, there is an image-to-PDF library known as img2pdf. It is available on PyPI, and I would be happy to do up a quick script for you.

    EDIT: A tutorial can be found here

    EDIT 2: This should do

    ## Import libraries ##
    import os
    import img2pdf
    # directory holding the JPEGs and directory that receives the PDFs
    dirofjpgs = "PUT DIRECTORY HERE"  # on Windows write C:\\Users\\..., not C:\Users\...
    pathforpdfs = "PUT DIRECTORY HERE"
    # change dir to working dir
    os.chdir(dirofjpgs)
    # collect the JPEG files in name order, skipping everything else (e.g. desktop.ini)
    jpg_files = sorted(
        f for f in os.listdir(os.curdir)
        if os.path.splitext(f)[1].lower() in (".jpg", ".jpeg")
    )

    # for every JPEG found
    for i, filename in enumerate(jpg_files):
        # convert with img2pdf (it reads the file itself, so Pillow is not needed)
        pdf_values = img2pdf.convert(filename)
        # save as <name>.pdf in the output dir
        pdf_path = os.path.join(pathforpdfs, os.path.splitext(filename)[0] + ".pdf")
        with open(pdf_path, "wb") as file:
            file.write(pdf_values)
        # progress counter
        print(str(i + 1), "/", len(jpg_files))
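
The script above writes one PDF per photo. To get one PDF per document with the pages in order, the filenames could be grouped first. What follows is only a rough sketch under a few assumptions: the date part contains no spaces, p is always a whole number, and naming the output "date y.pdf" is acceptable. The helper names `group_pages` and `write_pdfs` are made up for this example. It relies on `img2pdf.convert` accepting a list of image paths, which it turns into consecutive pages.

```python
import os
from collections import defaultdict


def group_pages(filenames):
    """Group JPEG names by (date, y) and order each group by page number p."""
    documents = defaultdict(list)
    for name in filenames:
        stem, ext = os.path.splitext(name)
        if ext.lower() not in (".jpg", ".jpeg"):
            continue  # skip non-JPEG files such as desktop.ini
        parts = stem.split(" ")
        if len(parts) < 4:
            continue  # name does not follow "date y p id [optional]"
        date, running_no, page = parts[0], parts[1], parts[2]
        # assumption: p is a whole number, so pages sort numerically (2 before 10)
        documents[(date, running_no)].append((int(page), name))
    return {
        key: [name for _, name in sorted(pages)]
        for key, pages in documents.items()
    }


def write_pdfs(jpg_dir, pdf_dir):
    """Write one multi-page PDF per document into pdf_dir."""
    import img2pdf  # third-party: pip install img2pdf

    groups = group_pages(os.listdir(jpg_dir))
    for (date, running_no), names in groups.items():
        paths = [os.path.join(jpg_dir, n) for n in names]
        out_path = os.path.join(pdf_dir, f"{date} {running_no}.pdf")
        with open(out_path, "wb") as f:
            f.write(img2pdf.convert(paths))
```

Calling `write_pdfs("PUT DIRECTORY HERE", "PUT DIRECTORY HERE")` with the two directories would then produce the PDFs. As far as I know img2pdf only embeds the images as they are, so printing parts of the filename on each page would need a different tool.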
    
