Tags: pdf, google-apps-script, gmail, pdf-generation

Images not reproduced when converting a Blob from HTML to PDF


I want to convert HTML emails into a PDF. I have written the following piece of code.

      var txt = msgs[i].getBody();
      /* We need two blob conversions - one from text to HTML and the other from HTML to PDF */
      var blob = Utilities.newBlob(txt, 'text/html', 'Test PDF');
      Logger.log(txt);
      /* DriveApp is used here because the DocsList service has been retired */
      var tempDoc = DriveApp.createFile(blob);
      var pdf = tempDoc.getAs('application/pdf');
      pdf.setName('Email As PDF');
      DriveApp.createFile(pdf);

The code above first creates a Blob from the HTML of a Gmail message and then uses the getAs() function to convert it to a PDF. However, the images in the HTML do not appear in the PDF. Any ideas on how to get these images into the PDF would be appreciated. Alternative ways of converting a Gmail message to PDF are also welcome.


Solution

  • Interesting problem. It makes sense that this doesn't work: the PDF conversion doesn't bother "rendering" the HTML, so it never fetches the image behind each src.

    I did a quick test and confirmed that data URIs (inline images that don't require a separate HTTP call) do show up in the PDF.

    So, one hacky solution could be to fetch the images and convert them to data URIs. This has a few downsides: the images are hard to find (a regex would be fragile or not comprehensive), it needs lots of UrlFetch calls (even with some caching, most automated email senders add trackers, so you end up re-fetching the same image), and it is slow.

    Convert:

    <img src="http://images.myserver.com/myimage.png..."/>

    To (you can check the content type dynamically as well):

    <img src="..."/>
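
As a rough sketch of the conversion described above (not a tested production solution), the rewrite could look something like the following. The helper names `inlineImages` and `fetchImageAsBase64` are hypothetical, and the regex is deliberately simple: it only catches quoted `http(s)` `src` attributes on `<img>` tags and does not decode HTML entities in URLs, so it has the fragility already mentioned. The fetcher is passed in as a parameter so the string rewriting can be tried without network access.

```javascript
// Sketch only: inlineImages and fetchImageAsBase64 are hypothetical helper names.

/**
 * Replaces each quoted http(s) <img> src in the HTML with a data URI.
 * fetchImage(url) must return {contentType: string, base64: string}.
 */
function inlineImages(html, fetchImage) {
  var cache = {}; // avoids re-fetching an identical URL within one message
  var pattern = /(<img\b[^>]*?\bsrc\s*=\s*)(["'])(https?:\/\/[^"']+)\2/gi;
  return html.replace(pattern, function (match, prefix, quote, url) {
    if (!Object.prototype.hasOwnProperty.call(cache, url)) {
      try {
        var img = fetchImage(url);
        cache[url] = 'data:' + img.contentType + ';base64,' + img.base64;
      } catch (e) {
        cache[url] = url; // keep the original URL if the fetch fails
      }
    }
    return prefix + quote + cache[url] + quote;
  });
}

/** Apps Script fetcher: downloads the image and base64-encodes its bytes. */
function fetchImageAsBase64(url) {
  var blob = UrlFetchApp.fetch(url).getBlob();
  return {
    contentType: blob.getContentType(), // content type taken from the response
    base64: Utilities.base64Encode(blob.getBytes())
  };
}
```

In the question's code, this would presumably be applied before building the Blob, for example `Utilities.newBlob(inlineImages(txt, fetchImageAsBase64), 'text/html', 'Test PDF')`. The cache only helps when the URLs are byte-for-byte identical, so per-recipient tracker parameters would still cause repeat fetches.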