Search code examples
google-apps-scriptgoogle-sheetsgoogle-people-api

Duplicate contact creation using appscript despite a functional filter function


Context

A bit of context before we can dive into the code: I am currently working for a non-profit organisation for the protection of cats. I'm not a pro developer, I'm not paid to work on this, but since I'm the only one willing to do it and who knows a bit how to code, I volunteered to write a script for creating and updating adopter and abandoner contacts for our cats.

The other volunteers in the organisation are using Google Sheets to keep track of lots of information about the cats, including their adopters' and abandoners' contact information. On the other hand, the person in charge of the organisation wants to have every adopter or abandoner contact in a specific format in her Google Contacts. Before I wrote the script, volunteers used to enter all the info in the spreadsheet and enter it again in the boss' contacts.

The script is mostly functional and handles the update of contact information as well. However, the script creates duplicate contacts for some people, and I don't really understand why (although I may have a lead). It is a bug which only happens when volunteers use the script, but not when I use it; which makes me think something goes wrong when they call the script...

Code

The script creates an array of Person objects. Every person has a method to return a contactObject version of itself, compatible with Google's People API and I use People API's batchUpdateContacts function to create contacts, by batches of 200 while there are new contacts in the array.

In order to know the contacts already created, I first get the created connections using this function:

/** Helper function to list all connections of the current google user
 * 
 * @returns {PeopleAPI.Person[]} connections - All of the connection objects from Google's People API
 */
/** Helper function to list all connections of the current google user
 * 
 * @returns {PeopleAPI.Person[]} connections - All of the connection objects from Google's People API
*/
function getAllConnections_() {
  var connections = [];
  var apiResponse;
  var nextPageToken;
  var firstPass = true;
  do {
    if (firstPass) {
      apiResponse = People.People.Connections.list('people/me', {'personFields': 'memberships,emailAddresses,phoneNumbers,names,addresses,biographies,userDefined'});
      firstPass = false;
    }
    else {
      apiResponse = People.People.Connections.list('people/me', {'personFields': 'memberships,emailAddresses,phoneNumbers,names,addresses,biographies,userDefined', 'pageToken': nextPageToken});
    }
    connections = connections.concat(apiResponse.connections);
    nextPageToken = apiResponse.nextPageToken;
  } while (nextPageToken);
  return connections;
}

Then, I use a filter function to eliminate the already existing contacts based on the contacts email addresses (when a cat is adopted, we always ask for 2 email addresses, so I know there is at least one):

/** Helper function to filter the existing contacts and avoid creating them
 * 
 * @param {Person[]} people - people to filter from
 * @param {connections[]} connections - existing contacts in person's address book
 * @returns {Person[]} filteredPeople - people who are not in connections
 */
function filterExistingContacts_(people, connections) {
  if (!connections[0]) {
    return people;
  }
  return people.filter(function (person) {
    for (contact of connections) {
      if (!contact.emailAddresses) {continue;}
      if (contact.emailAddresses.filter(function (email) {return email.value.toLowerCase().replace(/\s/g, '').includes(person.email)}).length > 0) {return false;}
    }
    return true;
  });
}

In the above code, person.email is lowercased and spaces are replaced by ''. When I run those functions, I can't reproduce the bug, but when the script users do, they get any number from 2 to 74 duplicate contacts created.

Supposition and leads

My supposition is that, maybe, the "getAllConnections_" function gets a bad response from Google's People API, for some reason and thus, gets an incomplete array of existing connections. Then my filter function filters correctly (since I can see no fault in my logic here) the contacts, but some existing contacts are re-created because the script is not aware they already exist.

First idea

If this is so, I think possibly a SQL database could solve the problem (and lower the complexity of the algorithm, which is quite slow with the current ~4000 existing contacts). But I don't really know where I could find a free database (for the organisation would much prefer paying for veterinary care than for this) which would function with Appscript ; plus that would mean a lot of work on the code itself to adapt it... So I would like to know if you guys think it may solve the problem or if I'm completely mistaken before I give it some more hours.

Second idea

Also, I thought about using something like the "ADDED" trick described here: Delete duplicated or multiplied contacts from Google Contacts as a workaround... But the spreadsheet is not structured per contact, but per cat. So it would lead to a problem for a specific situation which is, actually and sadly, quite frequent:

  1. Patrick Shmotherby adopts the cat Smoochie → Smoochie's adopter column is marked as "ADDED" and Patrick's contact is created.
  2. Patrick Shmotherby later abandons Smoochie → Smoochie's abandoner column is marked as "ADDED" and Patrick's contact is updated.
  3. Karen Klupstutsy later adopts Smoochie → Smoochie's adopter column is already marked as "ADDED" so Karen's contact is not created.

A solution could be asking the volunteers to delete the "ADDED" marker manually, yet I think you can understand why this is error-prone when updating lots of contacts on the same day and time-consuming for the volunteers.

Third idea

I thought I might create a function for auto-deleting duplicate contacts from the Google account, but I would prefer not to use this solution as I'm afraid I could delete some important data there, especially as this is the boss' professional, organisational and personal account.

How you could help me

I must say, despite my leads, I'm a bit lost and confused by these duplicates, especially since I can't debug anything because I can't reproduce the bug myself. If you have any fourth lead, I would welcome it warmly.

Also, and because I'm a hobbyist, it's very likely that I didn't do things the correct way, or did not know I could do something else (e.g. I suggested using a SQL database because I know of the existence of relational databases, but maybe there are other common tools I've never heard of). So any suggestion would be good too.

Finally, if you think I'm correct on my own diagnosis, telling me so could help me get the motivation to re-write my code almost entirely if needed. And if you know where I could find a free database usable with Google Appscript (I know quality has a price, so I don't have much hope for this, but we never know) and if it's not "host your own database in you basement", that would be awesome!

Tell me if you need more information, if you want me to put some other piece of code or anything.

Have a nice day/afternoon/evening/night, Benjamin


Solution

  • Alright so I found where the problem was from, thanks to @OctaviaSima who pointed me to the executions logs. Apparently, for some reason I don't know, sometimes, my function "getAllConnections_()" which was supposed to get all the contacts in the Google Contacts book failed to get some contacts using this code:

    /** Helper function to list all connections of the current google user
     * 
     * @returns {PeopleAPI.Person[]} connections - All of the connection objects from Google's People API
     */
    function getAllConnections_() {
      var connections = [];
      var apiResponse;
      var nextPageToken;
      var firstPass = true;
      do {
        if (firstPass) {
          apiResponse = People.People.Connections.list('people/me', {'personFields': 'memberships,emailAddresses,phoneNumbers,names,addresses,biographies,userDefined'});
          firstPass = false;
        }
        else {
          apiResponse = People.People.Connections.list('people/me', {'personFields': 'memberships,emailAddresses,phoneNumbers,names,addresses,biographies,userDefined', 'pageToken': nextPageToken});
        }
        connections = connections.concat(apiResponse.connections);
        nextPageToken = apiResponse.nextPageToken;
      } while (nextPageToken);
      return connections;
    }
    

    E.g. last execution, the actual contact list was 4061 connections long, however the script only got 4056 connections, which led to 5 duplicate contacts being created. I added a quick patch by ensuring the connections table was as long as the number of contacts by calling the function recursively if it's not the case.

    /** Helper function to list all connections of the current google user
     * 
     * @returns {PeopleAPI.Person[]} connections - All of the connection objects from Google's People API
     */
    function getAllConnections_() {
      var connections = [];
      var apiResponse;
      var nextPageToken;
      var firstPass = true;
      do {
        if (firstPass) {
          apiResponse = People.People.Connections.list('people/me', {'personFields': 'memberships,emailAddresses,phoneNumbers,names,addresses,biographies,userDefined'});
          firstPass = false;
        }
        else {
          apiResponse = People.People.Connections.list('people/me', {'personFields': 'memberships,emailAddresses,phoneNumbers,names,addresses,biographies,userDefined', 'pageToken': nextPageToken});
        }
        connections = connections.concat(apiResponse.connections);
        nextPageToken = apiResponse.nextPageToken;
      } while (nextPageToken);
      if (connections.length != apiResponse.totalItems) {connections = getAllConnections_();} // Hopefully, the magic lies in this line of code
      return connections;
    }
    

    Posting this here in case it helps someone else.

    Edit: Just corrected the test on the magic line from "==" to "!=".