How can I extract inline images from a Gmail email? (None of the available workarounds work anymore)


I wrote a script that exports my received Gmail messages to PDF. Everything works except that I cannot extract inline images from messages that were sent from a Gmail account.

The problem is that nothing links the inline image attachments (pasted images all get the same name, e.g. graphic.png) to the cid (e.g. ii_l5vcyjv50) used in the HTML.

There seem to be hundreds of workarounds online that extract an image's base64 data from message.getRawContent() and substitute it for the corresponding cid in the HTML. None of them work anymore, either because the structure of getRawContent() has changed and the regular expressions no longer match, or because Google changed how inline images are linked (the format of the cid).

Is it really that difficult to get access to the inline images? How could I extract the X-Attachment-Id values and their corresponding base64 data into an array?

Unfortunately, the base64 data is not preceded by any tags, and the regular expressions are completely beyond me.

Thank you very much!

...
Dies ist ein Screenshot:
[image: grafik.png]

Dies ist eine per Drag und Drop eingef=C3=BCgte Bilddatei:
[image: amsel.jpg]

Dies ist ein Bild aus Word:
[image: grafik.png]

Dies ist ein Bild direkt aus dem Browser =C3=BCber "Grafik kopieren"
[image: grafik.png]

--=20
Viele Gr=C3=BC=C3=9Fe
Benni

--0000000000008333b505e48ad3d1
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

<div dir=3D"ltr"><br><br><div class=3D"gmail_quote"><div dir=3D"ltr" class=
=3D"gmail_attr">---------- Forwarded message ---------<br>Von: <b class=3D"=
gmail_sendername" dir=3D"auto">Summer Moon</b> <span dir=3D"auto">&lt;<a hr=
ef=3D"mailto:[email protected]">[email protected]</a>&gt;</span=
><br>Date: Do., 21. Juli 2022 um 20:24=C2=A0Uhr<br>Subject: Gmail Inline Im=
age<br>To: Summer Moon &lt;<a href=3D"mailto:[email protected]">summe=
[email protected]</a>&gt;<br></div><br><br><div dir=3D"ltr"><div>Dies ist =
ein Screenshot:</div><div><img src=3D"cid:ii_l5vd00311" alt=3D"grafik.png" =
width=3D"468" height=3D"263"><br><br></div><div>Dies ist eine per Drag und =
Drop eingef=C3=BCgte Bilddatei:</div><div><img src=3D"cid:ii_l5vd1fs22" alt=
=3D"amsel.jpg" width=3D"468" height=3D"222"><br><br></div><div><br></div><d=
iv>Dies ist ein Bild aus Word:</div><div><img src=3D"cid:ii_l5vd44ms3" alt=
=3D"grafik.png" width=3D"468" height=3D"468"><br><br></div><div><br></div><=
div>Dies ist ein Bild direkt aus dem Browser =C3=BCber &quot;Grafik kopiere=
n&quot;<br></div><div><img src=3D"cid:ii_l5vcyjv50" alt=3D"grafik.png" widt=
h=3D"468" height=3D"222"><br><br>-- <br><div dir=3D"ltr" data-smartmail=3D"=
gmail_signature"><div dir=3D"ltr">Viele Gr=C3=BC=C3=9Fe=C2=A0<div>Benni</di=
v></div></div></div></div>
</div></div>

--0000000000008333b505e48ad3d1--
--0000000000008333b605e48ad3d2
Content-Type: image/png; name="grafik.png"
Content-Disposition: inline; filename="grafik.png"
Content-Transfer-Encoding: base64
Content-ID: <ii_l5vcyjv50>
X-Attachment-Id: ii_l5vcyjv50

iVBORw0KGgoAAAANSUhEUgAAB4AAAAOPCAYAAAAqu9wAAAAABHNCSVQICAgIfAhkiAAAIABJREFU
eF7svdmXJOd53vnmXllbV29oNIAGAZAUaZImJYqmRGksjzSWfSEf6VbnzPHV3M7dXMyl+WdoLnx0
xvacIx/bc6GjsWXNcEYSTJEUKYIUuADEjsbS6LX23DPmeSMruqOiMjIyKzMrt182ClUZ8S3v9/u+
iKzKJ5/3ywVBJwiCnPlXNzB9t9ijq5/z8QP8DAEIQAACEIBAjEAul4bj1AtqWqGpHc+nBza1Pgc1
fDac2fIZFCvnIJBFQL82j/nw37HTH7lgvN+/AxscoH7zT+88PDO4vv5iGFg/yI07vjH7T0zQ2fvP
wPAzT57+e+lJ8WH78b+5Bj3yGcNPqx7FdSqO3NnSfQ6Fcx7Vj+YvlxhQtG6y1ld8bP37elLiXNdS
VqWMTjNOD5qa8NyY3We2P26BrPiy2s/mM3iB5mIXSDyWYePq1/+wdX1suZP7V5Bxn8ricN7z8f77
jaVfuz6+fnEPWz8c98mlHrFKYxZvM61MFOMo/aeNq9/xcY5FnFLbSLtBn1SY9ZizmCbvu6njPOeJ
s68IpxsafHWfs1OqQQACEIAABCAAAQjMjEBxZj3TMQQgAAEIQAACEIAABFaSwHgC7/wjW/bxnZ2B
YcXfszU5MjKBLAVl5AapAAEIQAACEIAABCAAAQhAAAIQWD4CCMDLN6eMCAIQgAAEIAABCEAAAhAQ
gWkIs95mhslsIPtpxDSww1FPRg70M0JrJOxnechG7ZDyy0Qgy2G5TGOd9lhCZ3Dicov4ph2PYvLz
zMW0Z4j2IQABCEAAAhCAAAQgMN8EEIDne36IDgIQgAAEINCXACna+mLhIATmgkDyjfmLDmrW94eZ
j/8CAIwj4k47vFPtn0OrfTy2hPo0q5S6o14/s15/o8Y76fJZ4x9fFDzHotI
gs+KaNIdlaa+fCOxj´´´
SzseH/cyisBZ98+s1Tnv6zAr/qx1ncUnqz7nIQABCEAAAhCAAASWi

Solution

  • I believe your goal is as follows:

    • You want to retrieve the image data of the inline images in a Gmail message.
    • You want to achieve this using Google Apps Script.

    In this case, how about the following sample script?

    Sample script:

    This script uses the Gmail API, so please enable the Gmail API under Advanced Google services. Then set messageId to the ID of the email that contains the inline images.

    function myFunction() {
      const messageId = "###"; // Set the ID of the message you want to read.
    
      // Fetch the full message; payload.parts holds its MIME parts.
      const obj = Gmail.Users.Messages.get("me", messageId);
      const files = obj.payload.parts.reduce((ar, { body, headers, mimeType }) => {
        // Inline images carry an X-Attachment-Id header whose value is the cid used in the HTML.
        const ob = headers.find(({ name }) => name == "X-Attachment-Id");
        if (ob) {
          // Download the attachment data and wrap it in a blob named after the cid.
          const { data } = Gmail.Users.Messages.Attachments.get("me", messageId, body.attachmentId);
          const blob = Utilities.newBlob(data, mimeType, ob.value);
          ar.push({ cid: ob.value, blob });
        }
        return ar;
      }, []);
    
      // Optional: save each image as a file in the root folder, using its cid as the filename.
      files.forEach(({ blob }) => DriveApp.createFile(blob));
    }
    
    • When this script is run, the attachment IDs are first retrieved from the email with Gmail.Users.Messages.get. The image data is then retrieved for each attachment ID with Gmail.Users.Messages.Attachments.get. With the Gmail API of Advanced Google services, the data is returned as a byte array. The resulting files value is an array of objects, each holding the cid value and the image blob.
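
The sample script reads only the top-level payload.parts. Depending on how a message was composed, inline images may sit inside a nested multipart part, so a recursive walk may be more robust. The following is a minimal sketch, assuming the part shape returned by the Gmail API (parts, headers, mimeType, body.attachmentId). collectInlineParts is a hypothetical helper name, and falling back to the Content-ID header when X-Attachment-Id is missing is an assumption, not part of the original answer.

```javascript
// Hypothetical helper: walk a Gmail API payload recursively and collect
// every part that has an attachment ID plus a cid-style header.
function collectInlineParts(part, found = []) {
  const headers = part.headers || [];
  // Prefer X-Attachment-Id; fall back to Content-ID (assumption).
  const header =
    headers.find(({ name }) => name.toLowerCase() === "x-attachment-id") ||
    headers.find(({ name }) => name.toLowerCase() === "content-id");
  if (header && part.body && part.body.attachmentId) {
    found.push({
      // Content-ID values are wrapped in angle brackets; strip them.
      cid: header.value.replace(/^<|>$/g, ""),
      mimeType: part.mimeType,
      attachmentId: part.body.attachmentId,
    });
  }
  // Descend into nested multipart containers, if any.
  (part.parts || []).forEach((child) => collectInlineParts(child, found));
  return found;
}
```

In Apps Script, this could be called as collectInlineParts(Gmail.Users.Messages.get("me", messageId).payload), and each returned attachmentId could then be passed to Gmail.Users.Messages.Attachments.get as in the sample script.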

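Since the goal in the question is a PDF export, the cid references in the HTML body still need to be replaced by the image data. The following is a minimal sketch of that step, assuming each image is available as an object with cid, mimeType and base64 fields. embedInlineImages and these field names are hypothetical, chosen for illustration.

```javascript
// Hypothetical helper: replace each "cid:..." reference in the HTML body
// with a data URI built from the matching image's base64 data.
function embedInlineImages(html, images) {
  return images.reduce((result, { cid, mimeType, base64 }) => {
    const dataUri = "data:" + mimeType + ";base64," + base64;
    // Escape regex metacharacters in the cid before building the pattern.
    const escaped = cid.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
    // The lookahead ensures "cid:ii_1" does not also match "cid:ii_12".
    const pattern = new RegExp("cid:" + escaped + "(?=[\"'\\s>])", "g");
    return result.replace(pattern, () => dataUri);
  }, html);
}
```

In Apps Script, the base64 string for a blob can be produced with Utilities.base64Encode(blob.getBytes()), the MIME type with blob.getContentType(), and the HTML body with GmailApp.getMessageById(messageId).getBody(). The returned HTML then carries the images inline and no longer depends on the cid links.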