PDF to PNG using Ghostscript - crop to top third of page


I am trying to convert a PDF file to an image and, in the process, crop it to roughly the top third of the first page.

This command gives me the whole page, and making the -g values smaller crops toward the bottom-left corner.

for %%x in (*) do "......\program\gs\gs9.23\bin\gswin32c.exe" -g2500x3300 -dFIXEDMEDIA -dMaxBitmap=500000000 -dAlignToPixels=0 -dGridFitTT=2 -sDEVICE=pngalpha -dTextAlphaBits=4 -dGraphicsAlphaBits=4 -r300x300 -dBATCH -dNOPAUSE -dFirstPage=1 -dLastPage=1 -SOutputFile="%%~nx.png" "%%~nx.pdf"

I want the smaller image so that OCR on it is faster; most of the letters and documents I am dealing with have the information I am after in the top third.


Solution

  • The origin (0, 0) of the PostScript page (and the PDF page) is at the lower left. So by reducing the media size you make the uppermost portion of the content lie off the media, and therefore it is not rendered.

    So what you need to do is reduce the size of the media (which you have done) *and* translate the origin so that the top of the content lies on the media.

    Try adding -c "<< /BeginPage {0 -300 translate} >> setpagedevice" -f before the input PDF file. That moves the origin below the bottom edge of the media, so more of the top of the page, and correspondingly less of the bottom, is rendered. The translate operands are in PostScript user-space units (1/72 inch by default), not pixels, so adjust the value until the part you want lies on the media.

    Obviously since I don't know how large your content is, I can't give you an exact answer.
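
The arithmetic behind the two values (media size and downward shift) can be sketched as follows. This is only an illustration under stated assumptions: the page is taken to be US Letter (612 x 792 points), the shift is assumed to be interpreted in points (1/72 inch), and `top_crop_params` is a hypothetical helper, not part of Ghostscript.

```python
def top_crop_params(page_w_pt, page_h_pt, dpi, fraction):
    """Return (width_px, height_px, y_shift_pt) for keeping the top `fraction` of a page.

    Assumes page dimensions are given in points (1/72 inch) and that the
    translate offset is also expressed in points.
    """
    keep_h_pt = page_h_pt * fraction           # height of the strip to keep, in points
    width_px = round(page_w_pt / 72 * dpi)     # media width for -g, in pixels
    height_px = round(keep_h_pt / 72 * dpi)    # media height for -g, in pixels
    # Shift the content down by the height of the part being discarded,
    # so that the top of the page lands on the (shorter) media.
    y_shift_pt = round(page_h_pt - keep_h_pt, 2)
    return width_px, height_px, y_shift_pt


# Assumed example: US Letter page, 300 dpi, keep the top third.
w, h, dy = top_crop_params(612, 792, 300, 1 / 3)
print(f"-g{w}x{h}")
print(f"0 -{dy:g} translate")
```

For the assumed Letter page at 300 dpi this prints -g2550x1100 and 0 -528 translate; with a different page size or resolution, substitute your own numbers and check the result visually.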