I'm using wkhtmltopdf on OS X, and while it has generally been working as intended, the files it generates are larger than I had hoped for. My goal is essentially to save a screenshot of the text content of a webpage as a PDF; I don't really care about the images, hyperlinks, and other features on the page. I've been using the tool in conjunction with pdftk to save the first page of a website as a PDF, and below is an example of my code for the desired webpage (http://espn.go.com/mens-college-basketball/boxscore?gameId=400589702):
/usr/local/bin/wkhtmltopdf 'http://espn.go.com/mens-college-basketball/boxscore?gameId=400589702' --zoom 0.65 /Users/dwm8/Desktop/test.pdf
/usr/local/bin/pdftk /Users/dwm8/Desktop/test.pdf cat 1 output /Users/dwm8/Desktop/test2.pdf dont_ask
The size of the final file test2.pdf is 487 KB, which is larger than I would prefer. Are there any tricks I can use in wkhtmltopdf or pdftk to reduce the file size? Thanks for the help!
Well, if you don't care about hyperlinks or images, the obvious thing to do is suppress them using --disable-external-links and --no-images. If you are really only interested in the text, which is black and white, you may as well generate a greyscale PDF too:
/usr/local/bin/wkhtmltopdf --disable-external-links --no-images --zoom 0.65 --grayscale 'http://espn.go.com/mens-college-basketball/boxscore?gameId=400589702' result.pdf
which gets the file size down from about 500 kB to 70 kB on my system, a fairly useful 86% space saving!
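
If you want to keep the pdftk step from the question as well, the two commands could be wrapped up roughly like this. This is only a sketch: shrink_first_page and percent_saved are helper names I made up, the scratch-file naming is an assumption, and it assumes wkhtmltopdf and pdftk are both on your PATH.

```shell
#!/bin/sh
# Sketch only: shrink_first_page and percent_saved are hypothetical helper
# names, not part of wkhtmltopdf or pdftk.

# Render the URL in $1 as a greyscale PDF without images or external links,
# then keep only page 1 in the output file named by $2.
shrink_first_page() {
    url=$1
    out=$2
    tmp="${out%.pdf}.full.pdf"   # assumed scratch file next to the output
    wkhtmltopdf --disable-external-links --no-images --zoom 0.65 --grayscale \
        "$url" "$tmp" &&
    pdftk "$tmp" cat 1 output "$out" dont_ask &&
    rm -f "$tmp"
}

# Integer percentage saved when going from size $1 to size $2 (same units).
percent_saved() {
    echo $(( 100 * ($1 - $2) / $1 ))
}

# Example call (needs network access and both tools installed):
#   shrink_first_page 'http://espn.go.com/mens-college-basketball/boxscore?gameId=400589702' result.pdf
```

With the sizes I saw, percent_saved 500 70 prints 86. Quoting the URL keeps the shell from treating the ? as a glob character.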