Tags: python, linux, pdf-generation, pdftk, imposition

Reverse PDF imposition


I have an imposed document: 4 × n A4 pages are printed on n A3 sheets. I feed the sheets through a roller image scanner and get a single PDF document with 2 × n A3 pages.

If, say, n = 3, then I've got the following sequence of A3 pages in my PDF:

  • page one: page 12 (on the left) and page 1 of the original document
  • page two: p.2 and p.11 of the original document
  • page three: p.10 and p.3
  • … and so on until…
  • page six: p.6 and p.7 of the original document

Question: how can I reconstruct the original page sequence as a single A4 PDF file? That is, I want to do this:

--A3--         --A4--
[12| 1]         [1]
[ 2|11]         [2]
[10| 3]    ⇒    [3]
   …             … 
[ 6| 7]         [6]
                [7]
                 … 
                [12]
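
To make the pattern explicit for any n, here is a minimal sketch that generates the left/right page numbers of every scanned A3 side. It assumes the booklet-style imposition shown in the n = 3 example; `imposed_pairs` is a hypothetical helper name, not part of any tool.

```python
def imposed_pairs(n):
    """Return (left, right) original page numbers for each scanned A3 side.

    Assumes n sheets carrying 4 * n pages, scanned in the order shown
    in the example above.
    """
    total = 4 * n
    pairs = []
    for i in range(2 * n):
        low, high = i + 1, total - i
        if i % 2 == 0:
            # 1st, 3rd, 5th ... scanned side: high page number on the left
            pairs.append((high, low))
        else:
            # 2nd, 4th, 6th ... scanned side: low page number on the left
            pairs.append((low, high))
    return pairs


for side, (left, right) in enumerate(imposed_pairs(3), start=1):
    print("A3 page %d: [%2d|%2d]" % (side, left, right))
```

For n = 3 this lists the six A3 pages from `[12| 1]` through `[ 6| 7]`, matching the sequence described above.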

On Linux I usually use console utilities such as pdftk or pdftops for this kind of task, but I cannot figure out how to use them for my current purpose.


Solution

  • After a while I found this thread and tuned the code a bit:

    import copy
    import sys
    import math
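    # pyPdf is the legacy library; PyPDF2 releases before 3.0 should expose the
    # same camelCase API (for example via "import PyPDF2 as pyPdf").
    import pyPdf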
    
    def split_pages(src, dst):
        # Open both files in binary mode; open() replaces the Python 2-only file() builtin.
        src_f = open(src, 'rb')
        dst_f = open(dst, 'wb')
    
        input_PDF = pyPdf.PdfFileReader(src_f)
        num_pages = input_PDF.getNumPages()
    
        first_half, second_half = [], []
    
        for i in range(num_pages):
            p = input_PDF.getPage(i)
            q = copy.copy(p)
            q.mediaBox = copy.copy(p.mediaBox)
    
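            # Corner coordinates of the scanned A3 page, rounded down to whole points.
            left, bottom = [math.floor(v) for v in p.mediaBox.lowerLeft]
            right, top = [math.floor(v) for v in p.mediaBox.upperRight]
            # Midpoints are measured between the corners, so a lower-left
            # corner other than (0, 0) is handled as well.
            mid_x = math.floor((left + right) / 2.0)
            mid_y = math.floor((bottom + top) / 2.0)
    
            if right - left > top - bottom:
                # Landscape page: p becomes the left half, q the right half.
                p.mediaBox.upperRight = (mid_x, top)
                p.mediaBox.lowerLeft = (left, bottom)
    
                q.mediaBox.upperRight = (right, top)
                q.mediaBox.lowerLeft = (mid_x, bottom)
            else:
                # Portrait page: p becomes the top half, q the bottom half.
                p.mediaBox.upperRight = (right, top)
                p.mediaBox.lowerLeft = (left, mid_y)
    
                q.mediaBox.upperRight = (right, mid_y)
                q.mediaBox.lowerLeft = (left, bottom)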
    
    
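            # Odd-indexed scans (2nd, 4th, ...) have the lower page number in p,
            # even-indexed scans (1st, 3rd, ...) have it in q.
            if i % 2 == 1:
                first_half.append(p)
                second_half.append(q)
            else:
                first_half.append(q)
                second_half.append(p)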
    
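        output = pyPdf.PdfFileWriter()
        # first_half is already in reading order; second_half was collected
        # from the last page downwards, so reverse it before appending.
        for page in first_half + second_half[::-1]:
            output.addPage(page)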
    
        output.write(dst_f)
        src_f.close()
        dst_f.close()
    
    if len(sys.argv) < 3:
        print("\nusage:\n$ python reverse_impose.py input.pdf output.pdf")
        sys.exit(1)  # non-zero exit status signals incorrect usage
    
    input_file = sys.argv[1]
    output_file = sys.argv[2]
    
    split_pages(input_file,output_file)
    

    See this gist.
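
The two ideas in the script, halving each page box and re-ordering the halves, can be checked without any PDF library. The following is only a sketch: `split_box` and `restore_order` are hypothetical helper names, page numbers stand in for page objects, and the 1190 × 841 box is an assumed size for a landscape A3 scan in whole points.

```python
import math


def split_box(left, bottom, right, top):
    """Split a page box into two halves along its longer side.

    Returns two (left, bottom, right, top) tuples: the left and right
    halves of a landscape box, or the top and bottom halves of a
    portrait box.
    """
    mid_x = math.floor((left + right) / 2.0)
    mid_y = math.floor((bottom + top) / 2.0)
    if right - left > top - bottom:
        return (left, bottom, mid_x, top), (mid_x, bottom, right, top)
    return (left, mid_y, right, top), (left, bottom, right, mid_y)


def restore_order(sides):
    """Reorder (p, q) halves of the scanned sides into reading order."""
    first_half, second_half = [], []
    for i, (p, q) in enumerate(sides):
        if i % 2 == 1:
            first_half.append(p)
            second_half.append(q)
        else:
            first_half.append(q)
            second_half.append(p)
    # second_half was collected from the last page downwards
    return first_half + second_half[::-1]


# The n = 3 example from the question, as (left, right) page numbers.
scanned = [(12, 1), (2, 11), (10, 3), (4, 9), (8, 5), (6, 7)]
print(restore_order(scanned))
print(split_box(0, 0, 1190, 841))
```

Running it on the n = 3 example should print the pages 1 to 12 in order, followed by the two half-page boxes.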