
How to find whether PDF has landscape orientation or portrait


Are there tools to determine whether a PDF has landscape or portrait orientation?

I have looked at PDFBox and iText for this, but I could not find anything. Please tell me whether they support it.

Extracting the page information with Origami shows that the PDF pages have a rotation of some degrees. Here is what Origami reports:

{:Parent=>#<PDF::Reader::Reference:0x872349c @id=8, @gen=0>, :Type=>:Page,
 :Contents=>#<PDF::Reader::Reference:0x8722f24 @id=4, @gen=0>,
 :Resources=>#<PDF::Reader::Reference:0x870dbd8 @id=2, @gen=0>,
 :MediaBox=>[0, 0, 612, 792], :Rotate=>270}

Rotate : 270

What does the 'rotation' actually mean?


Solution

  • The pdfinfo command-line utility, when called with -box, shows the page size along with the MediaBox, CropBox, BleedBox, ArtBox and TrimBox values for every page. Here I ask for the values of pages 2 to 4 of a specific document:

    pdfinfo -box -f 2 -l 4 sample.pdf
      Creator:        FrameMaker 6.0
      Producer:       Acrobat Distiller 5.0.5 (Windows)
      CreationDate:   Thu Aug 17 16:43:06 2006
      ModDate:        Tue Aug 22 12:20:24 2006
      Tagged:         no
      Form:           AcroForm
      Pages:          146
      Encrypted:      no
      Page    2 size: 419.535 x 297.644 pts
      Page    2 rot:  90
      Page    3 size: 297.646 x 419.524 pts
      Page    3 rot:  0
      Page    4 size: 297.646 x 419.524 pts
      Page    4 rot:  0
      Page    2 MediaBox:     0.00     0.00   595.00   842.00
      Page    2 CropBox:     87.25   430.36   506.79   728.00
      Page    2 BleedBox:    87.25   430.36   506.79   728.00
      Page    2 TrimBox:     87.25   430.36   506.79   728.00
      Page    2 ArtBox:      87.25   430.36   506.79   728.00
      Page    3 MediaBox:     0.00     0.00   595.00   842.00
      Page    3 CropBox:    148.17   210.76   445.81   630.28
      Page    3 BleedBox:   148.17   210.76   445.81   630.28
      Page    3 TrimBox:    148.17   210.76   445.81   630.28
      Page    3 ArtBox:     148.17   210.76   445.81   630.28
      Page    4 MediaBox:     0.00     0.00   595.00   842.00
      Page    4 CropBox:    148.17   210.76   445.81   630.28
      Page    4 BleedBox:   148.17   210.76   445.81   630.28
      Page    4 TrimBox:    148.17   210.76   445.81   630.28
      Page    4 ArtBox:     148.17   210.76   445.81   630.28
      Page    4 MediaBox:     0.00     0.00   595.00   842.00
      File size:      6888764 bytes
      Optimized:      yes
      PDF version:    1.4
    

    Note the following:

    • *Box values: These are 4 numbers whose unit is the PostScript point (1/72 inch). The first pair gives the coordinates of the lower left corner, the second pair gives the coordinates of the upper right corner.

    • MediaBox: Is a required setting for each page inside the PDF.

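    • CropBox: Is an optional setting and defaults to the same as MediaBox if it is not explicitly defined. If it deviates from the MediaBox, then it tells PDF viewers (and printer drivers) to only render and display that particular part of the full page.

    • TrimBox: Is an optional setting and defaults to the same as CropBox if it is not explicitly defined. It describes the intended size of the finished page after trimming.

    • Page size: This info is computed from the distances set up by the CropBox value (which is identical to the TrimBox value for every page in this output).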

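    • rot: This gives the value of the page rotation, which is stored in the page's /Rotate key. It is the number of degrees by which the page is rotated clockwise when displayed or printed, and may be 0, 90, 180 or 270.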
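To make the relation between a box and the reported page size concrete, here is a minimal Python sketch. It assumes the box lines look exactly like the pdfinfo output shown above; the helper names `parse_box_line` and `box_size` are made up for illustration and are not part of any tool.

```python
import re

# Matches lines such as:
#   "Page    2 CropBox:     87.25   430.36   506.79   728.00"
# (layout assumed from the pdfinfo output quoted above)
BOX_LINE = re.compile(
    r"Page\s+(\d+)\s+(\w+Box):\s+([-\d.]+)\s+([-\d.]+)\s+([-\d.]+)\s+([-\d.]+)"
)

def parse_box_line(line):
    """Return (page, box_name, (llx, lly, urx, ury)) or None if no match."""
    m = BOX_LINE.search(line)
    if m is None:
        return None
    page = int(m.group(1))
    name = m.group(2)
    llx, lly, urx, ury = (float(v) for v in m.group(3, 4, 5, 6))
    return page, name, (llx, lly, urx, ury)

def box_size(box):
    """Width and height in points: upper right corner minus lower left corner."""
    llx, lly, urx, ury = box
    return abs(urx - llx), abs(ury - lly)

page, name, box = parse_box_line(
    "      Page    2 CropBox:     87.25   430.36   506.79   728.00"
)
width, height = box_size(box)
print(page, name, round(width, 2), round(height, 2))
```

For page 2 this gives roughly 419.54 x 297.64 points, which agrees (up to rounding of the printed box values) with the "Page 2 size" line in the output above.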

    Now, a page's landscape and portrait orientation is defined like this:

    • It is regarded as 'landscape' if the width is greater than the height.
    • It is regarded as 'portrait' if the height is greater than the width.
    • It is undetermined if width and height have the same value.
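The three rules above translate directly into a small function. This is only a sketch of the convention; the name `orientation` is hypothetical, and the sample sizes are the "size" values that pdfinfo printed for pages 2 and 3.

```python
def orientation(width, height):
    """Classify a page by comparing its width and height (same unit for both)."""
    if width > height:
        return "landscape"
    if height > width:
        return "portrait"
    return "undetermined"  # square page

print(orientation(419.535, 297.644))  # page 2 size from the pdfinfo output
print(orientation(297.646, 419.524))  # page 3 size from the pdfinfo output
```

Note that this looks at the size alone and does not yet take the rot value into account.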

    But...

    • ...you can put a non-zero /Rotate value into your PDF source code (which pdfinfo will show as rot: info) and this way make a 'portrait' PDF page display as 'landscape' and vice versa;

    • ...you could define a 'landscape' shaped TrimBox inside a 'portrait' shaped MediaBox or vice versa, and also combine it with a non-zero rotation, so that the 'landscape' shaped content appears in 'portrait' (or upside-down) orientation...
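One way to combine both effects is sketched below in Python. It assumes that a viewer shows the CropBox when one is present (otherwise the MediaBox) and that a rotation of 90 or 270 degrees swaps width and height. The function name `displayed_orientation` and its parameters are hypothetical; the input values come from the Origami dump in the question and from page 2 of the pdfinfo output.

```python
def displayed_orientation(media_box, crop_box=None, rotate=0):
    """Guess how a page looks on screen from its boxes and its rotation.

    Boxes are (llx, lly, urx, ury) tuples in points; rotate is a multiple of 90.
    """
    # Assumption: the visible region is the CropBox, falling back to the MediaBox.
    llx, lly, urx, ury = crop_box if crop_box is not None else media_box
    width, height = abs(urx - llx), abs(ury - lly)
    # A quarter turn (90 or 270 degrees) swaps width and height.
    if rotate % 180 == 90:
        width, height = height, width
    if width > height:
        return "landscape"
    if height > width:
        return "portrait"
    return "undetermined"

# Page from the question: MediaBox [0, 0, 612, 792] with Rotate 270.
print(displayed_orientation((0, 0, 612, 792), rotate=270))

# Page 2 from the pdfinfo output: landscape shaped CropBox, rot 90.
print(displayed_orientation(
    (0.00, 0.00, 595.00, 842.00),
    crop_box=(87.25, 430.36, 506.79, 728.00),
    rotate=90,
))
```

Under these assumptions the page from the question would display as landscape, while page 2 of the sample document would display as portrait even though its box is wider than it is tall.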

    Confused about this? Don't worry, many are. The fact is that 'landscape' and 'portrait' aren't clearly and unambiguously defined technical terms. They are just conventions to describe what we see...