javascript, google-apps-script, gmail

Gmail script not deleting all messages - deletes some each time


I have written a script to delete all emails in a label that are more than 10 days old. Each time I run the script it deletes some emails, but nowhere near all of them. I have about 7 labels, many of them with ~2k emails; the largest had about 18k.

function empty_mail() {

  // get GmailLabels as variables
  var labelOne = GmailApp.getUserLabelByName("myLabel1");
  var labelTwo = GmailApp.getUserLabelByName("myLabel2");
  // some more labels ...
  var lastLabel = GmailApp.getUserLabelByName("myLastLabel");

  // array of GmailLabels (edited)
  var labels = [labelOne, labelTwo, lastLabel];

  var days_old = 10; // all messages older than 10 days will be deleted in these labels

  // set threshold date in past
  var threshold_date = new Date();
  threshold_date.setDate(threshold_date.getDate() - days_old);

  // format threshold date as string and log
  var mnth = threshold_date.getMonth() + 1;
  var dy = threshold_date.getDate();
  var yr = threshold_date.getFullYear();
  var strDate = mnth + "/" + dy + "/" + yr;
  Logger.log("Mail older than " + strDate + " will be deleted.");

  // run delete_mail function on each label
  for (var i=0; i < labels.length; i++){
    Logger.log("==== Deleting old mail in: " + labels[i].getName() + " ====");  
    delete_mail(labels[i], threshold_date);
  }
}

function delete_mail(theLabel, oldDate){
  var count = 0;
  var threads = theLabel.getThreads();
  // if message is older than threshold date, move it to trash
  for (var i = 0; i < threads.length; i++){
    if (threads[i].getLastMessageDate() < oldDate){
      threads[i].moveToTrash();
      count++;
    }
  }
  Logger.log("    - deleted " + count + " messages from " + theLabel.getName());
}

At first glance, and in some early tests, this worked great. But I started logging when I noticed that not all labels were being purged, and the logger is outputting things like this:

[15-06-30 15:38:56:487 PDT] Mail older than 6/20/2015 will be deleted.
[15-06-30 15:38:56:488 PDT] ==== Deleting old mail in: <label1> ====
[15-06-30 15:38:56:921 PDT]     - deleted 0 messages from <label1>
[15-06-30 15:38:56:921 PDT] ==== Deleting old mail in: <label2> ====
[15-06-30 15:38:57:626 PDT]     - deleted 0 messages from <label2>
[15-06-30 15:38:57:627 PDT] ==== Deleting old mail in: <label3> ====
[15-06-30 15:38:57:900 PDT]     - deleted 0 messages from <label3>
[15-06-30 15:38:57:901 PDT] ==== Deleting old mail in: <label4> ====
[15-06-30 15:39:02:593 PDT]     - deleted 0 messages from <label4>
[15-06-30 15:39:02:594 PDT] ==== Deleting old mail in: <label5> ====
[15-06-30 15:39:02:849 PDT]     - deleted 0 messages from <label5>
[15-06-30 15:39:02:850 PDT] ==== Deleting old mail in: <label6> ====
[15-06-30 15:39:30:078 PDT]     - deleted 24 messages from <label6>
[15-06-30 15:39:30:079 PDT] ==== Deleting old mail in: <label7> ====
[15-06-30 15:39:36:633 PDT]     - deleted 0 messages from <label7>

The issue is that in at least labels 4, 6, and 7 there are many (hundreds or thousands of) emails that should still be deleted. Can anyone see a bug in my code, is some sort of timeout being hit, or has anyone else run into something like this? Thanks!


Solution

  • I solved this problem by investigating a bit more. The Apps Script Gmail documentation for the getThreads() function has this line: "This call will fail when the size of all threads is too large for the system to handle."

    In my case, it turns out the call wasn't failing, but only returning the first 500 threads. To fix this I used the getThreads(start, max) call, as below.

    function delete_mail(theLabel, oldDate){
    
      var threadStart = 0;
      var threadMax = 100;
      var totalDeletedCount = 0;
      var totalThreadCount = 0;
      var pageDeleteCount = 0;
      // get 100 threads at a time
      do {
        var threads = theLabel.getThreads(threadStart, threadMax);
        for (var i = 0; i < threads.length; i++){
          if (threads[i].getLastMessageDate() < oldDate){
            threads[i].moveToTrash();
            totalDeletedCount++;
            pageDeleteCount++;
          }
        }
        threadStart += threadMax;        //start at next max
        threadStart -= pageDeleteCount;  //subtract threads deleted in loop
        pageDeleteCount = 0;
        totalThreadCount += threads.length;
      } while (threads.length == threadMax);
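      // log how many of the threads seen were moved to trash
      Logger.log("    - deleted " + totalDeletedCount + " of " + totalThreadCount + " threads from " + theLabel.getName());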
    }
    

    This gets 100 threads at a time, then advances to the next page by adding 100 to the start index and subtracting however many threads were just deleted, since trashed threads drop out of the label and shift the remaining ones down.

    Other issues:

    • Still hitting a Google quota for too many operations.
      • See the Apps Script quota documentation for all limits.
    • Google will time out the script after 6 minutes of run time.
    • Due to one or both of the above, the script isn't finishing.
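One way to live with both limits is to stop cleanly before the timeout and trash a whole page in a single call. The following is only a sketch under assumptions: `deleteMailWithBudget` and its `deadlineMs` parameter are hypothetical names, it assumes (as the loop above does) that trashed threads drop out of the label, and it assumes `GmailApp.moveThreadsToTrash` accepts a page of up to 100 threads. Whether batching actually reduces quota usage is something to verify against your own account.

```javascript
// Hypothetical helper: trash old threads in one label until a time budget
// runs out. Returns { deleted, finished } so a later run can do the rest.
function deleteMailWithBudget(label, oldDate, deadlineMs) {
  var pageSize = 100; // assumed safe batch size for moveThreadsToTrash
  var start = 0;
  var deleted = 0;

  while (Date.now() < deadlineMs) {
    var page = label.getThreads(start, pageSize);

    // Collect the threads on this page whose last message is too old.
    var old = page.filter(function (thread) {
      return thread.getLastMessageDate() < oldDate;
    });

    if (old.length > 0) {
      GmailApp.moveThreadsToTrash(old); // one call for the whole page
    }
    deleted += old.length;

    // Assumption: trashed threads leave the label, so only step past
    // the threads that were kept.
    start += page.length - old.length;

    if (page.length < pageSize) {
      return { deleted: deleted, finished: true }; // reached the last page
    }
  }
  return { deleted: deleted, finished: false }; // ran out of time
}
```

It could be called with a deadline such as `Date.now() + 5 * 60 * 1000`, leaving some margin under the 6-minute limit, and rerun from a time-driven trigger until `finished` comes back true for every label.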

    Possible fixes:

    • Split out into one script per label, and run separately.
    • Keep each label to a more manageable size by deleting more mail.
    • Get the size of the label first, then loop from the end and break at the first thread with a new enough date.
      • I used this last option and it has worked well; post if you would like to see a code sample.
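For reference, the "loop from the end" idea might look roughly like the sketch below. This is not the author's actual code. The function names are hypothetical. As far as I know GmailLabel has no direct thread-count method, so the size is found here by paging forward. The sketch also assumes getThreads(start, max) returns threads newest first, so the oldest threads sit at the highest indexes.

```javascript
// Count the threads in a label by paging forward (assumption: there is no
// direct thread-count method on GmailLabel).
function countThreads(label, pageSize) {
  var total = 0;
  var page;
  do {
    page = label.getThreads(total, pageSize);
    total += page.length;
  } while (page.length === pageSize);
  return total;
}

// Walk backward from the oldest page, trashing threads until one is new
// enough to keep. Returns the number of threads moved to trash.
function deleteOldFromEnd(label, oldDate, pageSize) {
  pageSize = pageSize || 100;
  var deleted = 0;
  var end = countThreads(label, pageSize); // exclusive upper index

  while (end > 0) {
    var start = Math.max(0, end - pageSize);
    var page = label.getThreads(start, end - start);

    // Assumes newest-first order, so the last element is the oldest.
    for (var i = page.length - 1; i >= 0; i--) {
      if (page[i].getLastMessageDate() >= oldDate) {
        return deleted; // everything earlier is newer still, so stop
      }
      page[i].moveToTrash();
      deleted++;
    }
    end = start; // move one page toward the newer threads
  }
  return deleted;
}
```

Because threads are only ever removed from the tail, the indexes of the pages still to be read do not shift, so no start-offset correction is needed. The loop also stops at the first thread that is new enough instead of checking every thread in the label, which should help with both the quota and the 6-minute limit, though the initial count still costs one call per page.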