Tags: pdf-generation, wkhtmltopdf, export-to-pdf, libreoffice-writer

Create small high quality PDF embedding optimized PNG?


I'm trying to create a small PDF file that embeds one optimized PNG image, displayed as a header and footer on every page of a 3-page PDF (so the same image must appear 6 times in the PDF).

My optimized PNG image is only 2.3KB. It looks very sharp.

Failed with LibreOffice

When I insert just one instance of the 2.3KB PNG image into a LibreOffice Writer document containing only text and then export it as PDF, I can see that the image gets recompressed as JPEG, and the resulting PDF file grows by about 40KB. It also loses quality: the image picks up fuzzy JPEG edges.

If I right-click the image and select compression, there is no way to disable recompressing the image (it's already optimized better than LibreOffice could do it). I've tried setting a compression level of 0, 1, 9, etc., and choosing JPG, no resize, lossless, and so on, but there was no improvement.

Failed with wkhtmltopdf

I also made a test page and converted it with wkhtmltopdf, but it did the same thing. Adding the low quality flag made no difference.

PDF Spec suggests PNG is supported?

From skimming the PDF spec, it looks like PNG images are supported.

Even plain text PDF files are surprisingly large

It's also disappointing that when I take a 7KB HTML file, which is basically just <html><body><p>foo...</p><p>bar...</p> (only about 15 paragraphs) with no CSS, the resulting 2-page PDF file is 30KB. Why should a 7KB (almost plain text) file become 30KB as a PDF?

Suggestions?

Can someone please suggest how to make a small PDF file in Linux? I need to include 7KB of text and repeat one PNG image 6 times.

Manually or programmatically. I'll take whatever I can get at this point.


Solution

  • PDF Spec suggests PNG is supported?

    PNG isn't supported per se: PDF allows embedding JPEG images as-is, but not PNG images. PDF does borrow a set of features from the PNG format, however (for example, Flate compression combined with PNG-style row predictors).

    rinohtype (full disclosure: I'm the author) tries to embed as much as possible from PNG images as-is into the PDF. This does involve some bit-juggling, for example to separate the alpha channel from the color data, but the image is not re-encoded. It does not (yet) support interlaced PNGs.

    rinohtype should be able to do what you want to achieve. Please note, however, that it is currently in beta, so you might encounter some bugs.
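To illustrate the idea of reusing PNG data without re-encoding, here is a minimal sketch in plain Python (standard library only). It is not rinohtype's code; the function names `png_chunks` and `png_to_image_xobject` are made up for this example. It assumes a non-interlaced grayscale or RGB PNG without an alpha channel, and it simply copies the compressed IDAT bytes into a PDF image stream that declares Flate compression with a PNG predictor. Palette and alpha images need extra handling that is not shown.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_chunks(data):
    """Yield (chunk_type, payload) for every chunk in a PNG byte string."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    pos = 8
    while pos + 8 <= len(data):
        length, chunk_type = struct.unpack(">I4s", data[pos:pos + 8])
        yield chunk_type, data[pos + 8:pos + 8 + length]
        pos += 12 + length  # length field + type + payload + CRC


def png_to_image_xobject(data):
    """Return the body of a PDF image XObject that reuses the PNG's IDAT bytes.

    Hypothetical helper for illustration: handles only non-interlaced
    grayscale (color type 0) and RGB (color type 2) images.
    """
    header = None
    idat = b""
    for chunk_type, payload in png_chunks(data):
        if chunk_type == b"IHDR":
            header = struct.unpack(">IIBBBBB", payload)
        elif chunk_type == b"IDAT":
            idat += payload  # IDAT chunks together form one zlib stream
    if header is None:
        raise ValueError("PNG has no IHDR chunk")
    width, height, depth, color_type, _, _, interlace = header
    if interlace:
        raise ValueError("interlaced PNGs cannot be copied as-is")
    channels = {0: (1, b"/DeviceGray"), 2: (3, b"/DeviceRGB")}
    if color_type not in channels:
        raise ValueError("palette and alpha PNGs are not handled in this sketch")
    colors, colorspace = channels[color_type]
    # /Predictor 15 tells the PDF reader that each row starts with a PNG filter byte.
    dictionary = (
        b"<< /Type /XObject /Subtype /Image /Width %d /Height %d"
        b" /ColorSpace %s /BitsPerComponent %d /Filter /FlateDecode"
        b" /DecodeParms << /Predictor 15 /Colors %d"
        b" /BitsPerComponent %d /Columns %d >> /Length %d >>"
    ) % (width, height, colorspace, depth, colors, depth, width, len(idat))
    return dictionary + b"\nstream\n" + idat + b"\nendstream"
```

The returned bytes still have to be wrapped in an indirect object (`N 0 obj ... endobj`) by whatever writes the PDF. Because it is a single object, every page can reference it, so an image shown 6 times is stored only once.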

    Even plain text PDF files are surprisingly large

    To keep the PDF as small as possible, make sure not to embed or subset any fonts. Use only the 14 standard ("base 14") PDF fonts, which PDF readers provide themselves.
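As a rough illustration of the font advice, the following sketch writes a text-only PDF by hand in Python. It references Helvetica, one of the base 14 fonts, by name and embeds no font data. The function name `build_pdf`, the A4 page size, and the fixed 11 pt layout are assumptions made for this example; it does no line wrapping and expects text that fits in Latin-1.

```python
import zlib


def build_pdf(pages):
    """Return PDF bytes with one page per list of text lines (hypothetical helper)."""
    objects = []  # object bodies; object number = list index + 1

    def add(body):
        objects.append(body)
        return len(objects)

    catalog = add(b"")   # placeholders, filled in once the page numbers are known
    pages_id = add(b"")
    # Base 14 font: referenced by name only, no /FontFile is embedded.
    font = add(b"<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica"
               b" /Encoding /WinAnsiEncoding >>")
    kids = []
    for lines in pages:
        ops = [b"BT /F1 11 Tf 14 TL 72 770 Td"]
        for line in lines:
            text = line.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
            ops.append(b"(" + text.encode("latin-1") + b") Tj T*")
        ops.append(b"ET")
        stream = zlib.compress(b"\n".join(ops))  # Flate-compress the page content
        content = add(b"<< /Length %d /Filter /FlateDecode >>\nstream\n" % len(stream)
                      + stream + b"\nendstream")
        kids.append(add(
            b"<< /Type /Page /Parent %d 0 R /MediaBox [0 0 595 842]"
            b" /Resources << /Font << /F1 %d 0 R >> >> /Contents %d 0 R >>"
            % (pages_id, font, content)))
    objects[catalog - 1] = b"<< /Type /Catalog /Pages %d 0 R >>" % pages_id
    objects[pages_id - 1] = b"<< /Type /Pages /Count %d /Kids [%s] >>" % (
        len(kids), b" ".join(b"%d 0 R" % k for k in kids))

    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for number, body in enumerate(objects, 1):
        offsets.append(len(out))  # byte offset of each object for the xref table
        out += b"%d 0 obj\n" % number + body + b"\nendobj\n"
    xref_pos = len(out)
    out += b"xref\n0 %d\n0000000000 65535 f \n" % (len(objects) + 1)
    for offset in offsets:
        out += b"%010d 00000 n \n" % offset
    out += b"trailer\n<< /Size %d /Root %d 0 R >>\nstartxref\n%d\n%%%%EOF\n" % (
        len(objects) + 1, catalog, xref_pos)
    return bytes(out)
```

For example, `open("out.pdf", "wb").write(build_pdf([["foo...", "bar..."], ["page two"]]))` would write a 2-page file. With no font program embedded and compressed page streams, the overhead beyond the text itself is limited to the handful of small objects and the cross-reference table.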