macos pdf wkhtmltopdf pdftk

Any Tricks to Use in wkhtmltopdf and pdftk to Reduce File Size?


I'm using wkhtmltopdf on OS X, and while it generally works as intended, the files it generates are larger than I had hoped. My goal is essentially to save a screenshot of the webpage's text content as a PDF; I don't care about the images, hyperlinks, or other features on the page. I've been using the tool together with pdftk to save the first page of a website as a PDF. Below is an example of my code for the desired webpage (http://espn.go.com/mens-college-basketball/boxscore?gameId=400589702):

/usr/local/bin/wkhtmltopdf 'http://espn.go.com/mens-college-basketball/boxscore?gameId=400589702' --zoom 0.65 /Users/dwm8/Desktop/test.pdf
/usr/local/bin/pdftk /Users/dwm8/Desktop/test.pdf cat 1 output /Users/dwm8/Desktop/test2.pdf dont_ask

The size of the final file test2.pdf is 487 KB, which is larger than I would prefer. Are there any tricks I can use in wkhtmltopdf or pdftk to reduce the file size? Thanks for the help!


Solution

  • Well, if you don't care about hyperlinks or images, the obvious thing to do is suppress them using --disable-external-links and --no-images. If you are really only interested in the text, which is black and white, you may as well generate a greyscale PDF too:

    /usr/local/bin/wkhtmltopdf --disable-external-links --no-images --zoom 0.65 --grayscale 'http://espn.go.com/mens-college-basketball/boxscore?gameId=400589702' result.pdf
    

    which gets the file size down from 500 kB to 70 kB on my system, a fairly useful 86% space saving!
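
    If you want to check the saving on your own pages, something like the following shell sketch may help. The helper names percent_saved and file_size are made up for illustration; they are not part of wkhtmltopdf or pdftk. The percentage uses integer arithmetic, so it is truncated rather than rounded.

```shell
#!/bin/sh
# Hypothetical helper: percentage saved when going from size $1 to size $2.
# Both sizes must be in the same unit; the result is truncated to an integer.
percent_saved() {
    before=$1
    after=$2
    echo $(( (before - after) * 100 / before ))
}

# Hypothetical helper: size of a file in bytes, using POSIX wc.
# tr strips the leading padding that some wc implementations emit.
file_size() {
    wc -c < "$1" | tr -d ' '
}

# The figures quoted above: 500 kB before, 70 kB after.
percent_saved 500 70
```

    With real files, assuming test2.pdf is the original output and result.pdf the reduced one, you could run: percent_saved "$(file_size test2.pdf)" "$(file_size result.pdf)"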