
Cannot get the body of email with Gmail PHP API


I'm having trouble with the Gmail PHP API.

I want to retrieve the body content of emails, but it only works for emails that have attachments. Why is that?

Here's my code so far:

// Authentication things above...
$client = getClient();
$gmail = new Google_Service_Gmail($client);    
$list = $gmail->users_messages->listUsersMessages('me', ['maxResults' => 1000]);

while ($list->getMessages() != null) {   
    foreach ($list->getMessages() as $mlist) {               
        $message_id = $mlist->id;   
        $optParamsGet2['format'] = 'full';
        $single_message = $gmail->users_messages->get('me', $message_id, $optParamsGet2);

        $threadId = $single_message->getThreadId();
        $payload = $single_message->getPayload();
        $headers = $payload->getHeaders();
        $parts = $payload->getParts();
        //print_r($parts); PRINTS SOMETHING ONLY IF I HAVE ATTACHMENTS...
        $body = $parts[0]['body'];
        $rawData = $body->data;
        $sanitizedData = strtr($rawData,'-_', '+/');
        $decodedMessage = base64_decode($sanitizedData); //should display my body content
    }

    if ($list->getNextPageToken() != null) {
        $pageToken = $list->getNextPageToken();
        $list = $gmail->users_messages->listUsersMessages('me', ['pageToken' => $pageToken, 'maxResults' => 1000]);
    } else {
        break;
    }
}

The only other way I know to get the content is the snippet field of the message, but it only holds the first 50 characters or so, which isn't very useful.


Solution

  • Let's do a little experiment. I've sent two messages to myself. One with an attachment, and one without.

    Request:

    GET https://www.googleapis.com/gmail/v1/users/me/messages?maxResults=2
    

    Response:

    {
     "messages": [
      {
       "id": "14fe21fd6b3fb46f",
       "threadId": "14fe21fd6b3fb46f"
      },
      {
       "id": "14fe21f9341ed73c",
       "threadId": "14fe21f9341ed73c"
      }
     ],
     "nextPageToken": "08943597140129624594",
     "resultSizeEstimate": 3
    }
    

    I only ask for the payload, since that is where all the relevant parts are:

    fields = payload
    
    GET https://www.googleapis.com/gmail/v1/users/me/messages/14fe21fd6b3fb46f?fields=payload
    
    GET https://www.googleapis.com/gmail/v1/users/me/messages/14fe21f9341ed73c?fields=payload
    

    Mail without attachment:

    {
     "payload": {
      "parts": [
       {
        "partId": "0",
        "mimeType": "text/plain",
        "filename": "",
        "headers": [
         {
          "name": "Content-Type",
          "value": "text/plain; charset=UTF-8"
         }
        ],
        "body": {
         "size": 22,
         "data": "aGVjaz8gTm8gYXR0YWNobWVudD8NCg=="
        }
       },
       {
        "partId": "1",
        "mimeType": "text/html",
        "filename": "",
        "headers": [
         {
          "name": "Content-Type",
          "value": "text/html; charset=UTF-8"
         }
        ],
        "body": {
         "size": 43,
         "data": "PGRpdiBkaXI9Imx0ciI-aGVjaz8gTm8gYXR0YWNobWVudD88L2Rpdj4NCg=="
        }
       }
      ]
     }
    }
    

    Mail with attachment:

    {
     "payload": {
      "parts": [
       {
        "mimeType": "multipart/alternative",
        "filename": "",
        "headers": [
         {
          "name": "Content-Type",
          "value": "multipart/alternative; boundary=001a1142e23c551e8e05200b4be0"
         }
        ],
        "body": {
         "size": 0
        },
        "parts": [
         {
          "partId": "0.0",
          "mimeType": "text/plain",
          "filename": "",
          "headers": [
           {
            "name": "Content-Type",
            "value": "text/plain; charset=UTF-8"
           }
          ],
          "body": {
           "size": 9,
           "data": "V293IG1hbg0K"
          }
         },
         {
          "partId": "0.1",
          "mimeType": "text/html",
          "filename": "",
          "headers": [
           {
            "name": "Content-Type",
            "value": "text/html; charset=UTF-8"
           }
          ],
          "body": {
           "size": 30,
           "data": "PGRpdiBkaXI9Imx0ciI-V293IG1hbjwvZGl2Pg0K"
          }
         }
        ]
       },
       {
        "partId": "1",
        "mimeType": "image/jpeg",
        "filename": "feelthebern.jpg",
        "headers": [
         {
          "name": "Content-Type",
          "value": "image/jpeg; name=\"feelthebern.jpg\""
         },
         {
          "name": "Content-Disposition",
          "value": "attachment; filename=\"feelthebern.jpg\""
         },
         {
          "name": "Content-Transfer-Encoding",
          "value": "base64"
         },
         {
          "name": "X-Attachment-Id",
          "value": "f_ieq3ev0i0"
         }
        ],
        "body": {
         "attachmentId": "ANGjdJ_2xG3WOiLh6MbUdYy4vo2VhV2kOso5AyuJW3333rbmk8BIE1GJHIOXkNIVGiphP3fGe7iuIl_MGzXBGNGvNslwlz8hOkvJZg2DaasVZsdVFT_5JGvJOLefgaSL4hqKJgtzOZG9K1XSMrRQAtz2V0NX7puPdXDU4gvalSuMRGwBhr_oDSfx2xljHEbGG6I4VLeLZfrzGGKW7BF-GO_FUxzJR8SizRYqIhgZNA6PfRGyOhf1s7bAPNW3M9KqWRgaK07WTOYl7DzW4hpNBPA4jrl7tgsssExHpfviFL7yL52lxsmbsiLe81Z5UoM",
         "size": 100446
        }
       }
      ]
     }
    }
    

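    These responses correspond to the $parts in your code. As you can see, if you are lucky, $parts[0]['body']->data will give you what you want, but most of the time it will not: in the mail with the attachment, $parts[0] is only a multipart/alternative container with an empty body, and the text sits one level deeper.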
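
    To make the difference concrete, here is a minimal JavaScript sketch, assuming Node.js for Buffer. The helper name decodeBase64Url is made up for illustration, and the two objects are trimmed copies of the responses above (headers and the attachment ID are left out).

```javascript
// Hypothetical helper: Gmail returns body data as URL-safe base64,
// so map '-' and '_' back to '+' and '/' before decoding.
function decodeBase64Url(data) {
  var normalized = data.replace(/-/g, '+').replace(/_/g, '/');
  return Buffer.from(normalized, 'base64').toString('utf8');
}

// Trimmed copy of the "mail without attachment" payload.
var withoutAttachment = {
  parts: [
    { mimeType: 'text/plain', body: { size: 22, data: 'aGVjaz8gTm8gYXR0YWNobWVudD8NCg==' } },
    { mimeType: 'text/html', body: { size: 43, data: 'PGRpdiBkaXI9Imx0ciI-aGVjaz8gTm8gYXR0YWNobWVudD88L2Rpdj4NCg==' } }
  ]
};

// Trimmed copy of the "mail with attachment" payload.
var withAttachment = {
  parts: [
    {
      mimeType: 'multipart/alternative',
      body: { size: 0 },
      parts: [
        { mimeType: 'text/plain', body: { size: 9, data: 'V293IG1hbg0K' } },
        { mimeType: 'text/html', body: { size: 30, data: 'PGRpdiBkaXI9Imx0ciI-V293IG1hbjwvZGl2Pg0K' } }
      ]
    },
    { mimeType: 'image/jpeg', filename: 'feelthebern.jpg', body: { size: 100446 } }
  ]
};

// parts[0] holds the text directly in the first mail...
console.log(JSON.stringify(decodeBase64Url(withoutAttachment.parts[0].body.data)));
// prints "heck? No attachment?\r\n"

// ...but in the second mail parts[0] is only a container without data.
console.log(withAttachment.parts[0].body.data === undefined);
// prints true
```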

    There are generally two approaches to this problem. You could implement the following algorithm (you are much better at PHP than me, but this is the general outline of it):

    1. Traverse the payload.parts and check if it contains a part that has the body you were looking for (either text/plain or text/html). If it has, you are done with your searching. If you were parsing a mail like the one above with no attachment, this would be enough.
    2. Do step 1 again, but this time with the parts found inside the parts you just checked, recursively. You will eventually find your part. If you were parsing a mail like the one above with an attachment, this would eventually find you your body.

    The algorithm could look something like the following (example in JavaScript):

    var response = {
     "payload": {
      "parts": [
       {
        "mimeType": "multipart/alternative",
        "filename": "",
        "headers": [
         {
          "name": "Content-Type",
          "value": "multipart/alternative; boundary=001a1142e23c551e8e05200b4be0"
         }
        ],
        "body": {
         "size": 0
        },
        "parts": [
         {
          "partId": "0.0",
          "mimeType": "text/plain",
          "filename": "",
          "headers": [
           {
            "name": "Content-Type",
            "value": "text/plain; charset=UTF-8"
           }
          ],
          "body": {
           "size": 9,
           "data": "V293IG1hbg0K"
          }
         },
         {
          "partId": "0.1",
          "mimeType": "text/html",
          "filename": "",
          "headers": [
           {
            "name": "Content-Type",
            "value": "text/html; charset=UTF-8"
           }
          ],
          "body": {
           "size": 30,
           "data": "PGRpdiBkaXI9Imx0ciI-V293IG1hbjwvZGl2Pg0K"
          }
         }
        ]
       },
       {
        "partId": "1",
        "mimeType": "image/jpeg",
        "filename": "feelthebern.jpg",
        "headers": [
         {
          "name": "Content-Type",
          "value": "image/jpeg; name=\"feelthebern.jpg\""
         },
         {
          "name": "Content-Disposition",
          "value": "attachment; filename=\"feelthebern.jpg\""
         },
         {
          "name": "Content-Transfer-Encoding",
          "value": "base64"
         },
         {
          "name": "X-Attachment-Id",
          "value": "f_ieq3ev0i0"
         }
        ],
        "body": {
         "attachmentId": "ANGjdJ_2xG3WOiLh6MbUdYy4vo2VhV2kOso5AyuJW3333rbmk8BIE1GJHIOXkNIVGiphP3fGe7iuIl_MGzXBGNGvNslwlz8hOkvJZg2DaasVZsdVFT_5JGvJOLefgaSL4hqKJgtzOZG9K1XSMrRQAtz2V0NX7puPdXDU4gvalSuMRGwBhr_oDSfx2xljHEbGG6I4VLeLZfrzGGKW7BF-GO_FUxzJR8SizRYqIhgZNA6PfRGyOhf1s7bAPNW3M9KqWRgaK07WTOYl7DzW4hpNBPA4jrl7tgsssExHpfviFL7yL52lxsmbsiLe81Z5UoM",
         "size": 100446
        }
       }
      ]
     }
    };
    
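    // In e.g. a plain text message, the payload is the only part.
    var parts = [response.payload];
    
    // Breadth-first walk over every part, including the nested ones.
    while (parts.length) {
      var part = parts.shift();
      if (part.parts) {
        parts = parts.concat(part.parts);
      }
    
      if (part.mimeType === 'text/html') {
        // body.data is URL-safe base64: restore '+' and '/', decode it,
        // then turn the resulting byte string into UTF-8 text.
        var decodedPart = decodeURIComponent(escape(atob(part.body.data.replace(/-/g, '+').replace(/_/g, '/'))));
        console.log(decodedPart);
      }
    }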
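
    The loop above prints every text/html part. A variant that returns the first match for a given MIME type, and that also covers single-part mails whose body sits directly on the payload, might look like the sketch below. The names findBody and decodeBase64Url are made up for this example, Buffer assumes Node.js, and the payload is a trimmed copy of the response with the attachment.

```javascript
// Hypothetical helper: decode Gmail's URL-safe base64 into a UTF-8 string.
function decodeBase64Url(data) {
  var normalized = data.replace(/-/g, '+').replace(/_/g, '/');
  return Buffer.from(normalized, 'base64').toString('utf8');
}

// Depth-first search through a part and its nested parts.
// Returns the decoded body of the first part with the given MIME type,
// or null when no such part carries any data.
function findBody(part, mimeType) {
  if (part.mimeType === mimeType && part.body && part.body.data) {
    return decodeBase64Url(part.body.data);
  }
  var children = part.parts || [];
  for (var i = 0; i < children.length; i++) {
    var found = findBody(children[i], mimeType);
    if (found !== null) {
      return found;
    }
  }
  return null;
}

// Trimmed copy of the payload of the mail with the attachment.
var payload = {
  parts: [
    {
      mimeType: 'multipart/alternative',
      body: { size: 0 },
      parts: [
        { mimeType: 'text/plain', body: { size: 9, data: 'V293IG1hbg0K' } },
        { mimeType: 'text/html', body: { size: 30, data: 'PGRpdiBkaXI9Imx0ciI-V293IG1hbjwvZGl2Pg0K' } }
      ]
    },
    { mimeType: 'image/jpeg', filename: 'feelthebern.jpg', body: { size: 100446 } }
  ]
};

// Prefer plain text and fall back to HTML when there is none.
var text = findBody(payload, 'text/plain');
if (text === null) {
  text = findBody(payload, 'text/html');
}
console.log(JSON.stringify(text));
// prints "Wow man\r\n"
```

    Because the search starts at the payload itself, a mail that consists of a single text/plain part with no parts array is handled by the same call.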

    The far easier option is to just get the raw data of the mail, and let an already-written library do the work for you:

    Request:

    format = raw
    fields = raw
    
    GET https://www.googleapis.com/gmail/v1/users/me/messages/14fe21fd6b3fb46f?format=raw&fields=raw
    

    Response:

    {
     "raw": "TUlNRS1WZXJzaW9uOiAxLjANClJlY2VpdmVkOiBieSAxMC4yOC45OS4xOTYgd2l0aCBIVFRQOyBGcmksIDE4IFNlcCAyMDE1IDEzOjIzOjAxIC0wNzAwIChQRFQpDQpEYXRlOiBGcmksIDE4IFNlcCAyMDE1IDIyOjIzOjAxICswMjAwDQpEZWxpdmVyZWQtVG86IGVtdGhvbGluQGdtYWlsLmNvbQ0KTWVzc2FnZS1JRDogPENBRHNaTFJ5eGk2UGt0MHZnUS1iZHd1N2FNLWNHRmZKcEwrRHYyb3ZKOGp4SGN4VWhfQUBtYWlsLmdtYWlsLmNvbT4NClN1YmplY3Q6IFdoYXQgZGENCkZyb206IEVtaWwgVGhvbGluIDxlbXRob2xpbkBnbWFpbC5jb20-DQpUbzogRW1pbCBUaG9saW4gPGVtdGhvbGluQGdtYWlsLmNvbT4NCkNvbnRlbnQtVHlwZTogbXVsdGlwYXJ0L2FsdGVybmF0aXZlOyBib3VuZGFyeT0wMDFhMTE0NjhmMTY1YzUwNDUwNTIwMGI0YzYxDQoNCi0tMDAxYTExNDY4ZjE2NWM1MDQ1MDUyMDBiNGM2MQ0KQ29udGVudC1UeXBlOiB0ZXh0L3BsYWluOyBjaGFyc2V0PVVURi04DQoNCmhlY2s_IE5vIGF0dGFjaG1lbnQ_DQoNCi0tMDAxYTExNDY4ZjE2NWM1MDQ1MDUyMDBiNGM2MQ0KQ29udGVudC1UeXBlOiB0ZXh0L2h0bWw7IGNoYXJzZXQ9VVRGLTgNCg0KPGRpdiBkaXI9Imx0ciI-aGVjaz8gTm8gYXR0YWNobWVudD88L2Rpdj4NCg0KLS0wMDFhMTE0NjhmMTY1YzUwNDUwNTIwMGI0YzYxLS0="
    }
    

    The biggest drawback of the second method is that if you get the message raw, you will download all the attachment data right away, which might be far too much data for your use case.
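
    Whichever library you pick, the raw value has to be decoded from URL-safe base64 first; what comes out is the complete mail as text, headers and all. Below is a rough sketch of that step, assuming Node.js. The sample message is invented for illustration (it is not the response above), the helper names are hypothetical, and the split on the first blank line only stands in for a real MIME parser: it does not handle multipart or encoded bodies.

```javascript
// Hypothetical helper: decode Gmail's URL-safe base64 into a UTF-8 string.
function decodeBase64Url(data) {
  var normalized = data.replace(/-/g, '+').replace(/_/g, '/');
  return Buffer.from(normalized, 'base64').toString('utf8');
}

// Hypothetical helper that mimics how a raw field is encoded,
// used here only to build a self-contained sample.
function encodeBase64Url(text) {
  return Buffer.from(text, 'utf8').toString('base64')
    .replace(/\+/g, '-')
    .replace(/\//g, '_');
}

// Invented single-part message, with CRLF line endings as in real mail.
var sample = [
  'Subject: Hello',
  'Content-Type: text/plain; charset=UTF-8',
  '',
  'Just a test body.'
].join('\r\n');

// Shape of a format=raw response, limited to the raw field.
var response = { raw: encodeBase64Url(sample) };

// Step 1: decode the raw field back into the full message text.
var message = decodeBase64Url(response.raw);

// Step 2: hand the text to a MIME parser. As a stand-in, split the
// header block from the body at the first blank line.
var separator = message.indexOf('\r\n\r\n');
var headerBlock = message.slice(0, separator);
var bodyBlock = message.slice(separator + 4);

console.log(headerBlock.split('\r\n')[0]);
// prints Subject: Hello
console.log(bodyBlock);
// prints Just a test body.
```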

    I'm not good at PHP, but this looks promising if you want to go with the second solution! Good luck!