A bit of context before we can dive into the code: I am currently working for a non-profit organisation for the protection of cats. I'm not a pro developer, I'm not paid to work on this, but since I'm the only one willing to do it and who knows a bit how to code, I volunteered to write a script for creating and updating adopter and abandoner contacts for our cats.
The other volunteers in the organisation are using Google Sheets to keep track of lots of information about the cats, including their adopters' and abandoners' contact information. On the other hand, the person in charge of the organisation wants to have every adopter or abandoner contact in a specific format in her Google Contacts. Before I wrote the script, volunteers used to enter all the info in the spreadsheet and enter it again in the boss' contacts.
The script is mostly functional and handles the update of contact information as well. However, the script creates duplicate contacts for some people, and I don't really understand why (although I may have a lead). It is a bug which only happens when volunteers use the script, but not when I use it; which makes me think something goes wrong when they call the script...
The script creates an array of Person objects. Every person has a method to return a contactObject version of itself, compatible with Google's People API and I use People API's batchUpdateContacts function to create contacts, by batches of 200 while there are new contacts in the array.
In order to know the contacts already created, I first get the created connections using this function:
/** Helper function to list all connections of the current google user
*
* @returns {PeopleAPI.Person[]} connections - All of the connection objects from Google's People API
*/
/** Helper function to list all connections of the current google user
*
* @returns {PeopleAPI.Person[]} connections - All of the connection objects from Google's People API
*/
function getAllConnections_() {
var connections = [];
var apiResponse;
var nextPageToken;
var firstPass = true;
do {
if (firstPass) {
apiResponse = People.People.Connections.list('people/me', {'personFields': 'memberships,emailAddresses,phoneNumbers,names,addresses,biographies,userDefined'});
firstPass = false;
}
else {
apiResponse = People.People.Connections.list('people/me', {'personFields': 'memberships,emailAddresses,phoneNumbers,names,addresses,biographies,userDefined', 'pageToken': nextPageToken});
}
connections = connections.concat(apiResponse.connections);
nextPageToken = apiResponse.nextPageToken;
} while (nextPageToken);
return connections;
}
Then, I use a filter function to eliminate the already existing contacts based on the contacts email addresses (when a cat is adopted, we always ask for 2 email addresses, so I know there is at least one):
/** Helper function to filter the existing contacts and avoid creating them
*
* @param {Person[]} people - people to filter from
* @param {connections[]} connections - existing contacts in person's address book
* @returns {Person[]} filteredPeople - people who are not in connections
*/
function filterExistingContacts_(people, connections) {
if (!connections[0]) {
return people;
}
return people.filter(function (person) {
for (contact of connections) {
if (!contact.emailAddresses) {continue;}
if (contact.emailAddresses.filter(function (email) {return email.value.toLowerCase().replace(/\s/g, '').includes(person.email)}).length > 0) {return false;}
}
return true;
});
}
In the above code, person.email is lowercased and spaces are replaced by ''. When I run those functions, I can't reproduce the bug, but when the script users do, they get any number from 2 to 74 duplicate contacts created.
My supposition is that, maybe, the "getAllConnections_" function gets a bad response from Google's People API, for some reason and thus, gets an incomplete array of existing connections. Then my filter function filters correctly (since I can see no fault in my logic here) the contacts, but some existing contacts are re-created because the script is not aware they already exist.
If this is so, I think possibly a SQL database could solve the problem (and lower the complexity of the algorithm, which is quite slow with the current ~4000 existing contacts). But I don't really know where I could find a free database (for the organisation would much prefer paying for veterinary care than for this) which would function with Appscript ; plus that would mean a lot of work on the code itself to adapt it... So I would like to know if you guys think it may solve the problem or if I'm completely mistaken before I give it some more hours.
Also, I thought about using something like the "ADDED" trick described here: Delete duplicated or multiplied contacts from Google Contacts as a workaround... But the spreadsheet is not structured per contact, but per cat. So it would lead to a problem for a specific situation which is, actually and sadly, quite frequent:
A solution could be asking the volunteers to delete the "ADDED" marker manually, yet I think you can understand why this is error-prone when updating lots of contacts on the same day and time-consuming for the volunteers.
I thought I might create a function for auto-deleting duplicate contacts from the Google account, but I would prefer not to use this solution as I'm afraid I could delete some important data there, especially as this is the boss' professional, organisational and personal account.
I must say, despite my leads, I'm a bit lost and confused by these duplicates, especially since I can't debug anything because I can't reproduce the bug myself. If you have any fourth lead, I would welcome it warmly.
Also, and because I'm a hobbyist, it's very likely that I didn't do things the correct way, or did not know I could do something else (e.g. I suggested using a SQL database because I know of the existence of relational databases, but maybe there are other common tools I've never heard of). So any suggestion would be good too.
Finally, if you think I'm correct on my own diagnosis, telling me so could help me get the motivation to re-write my code almost entirely if needed. And if you know where I could find a free database usable with Google Appscript (I know quality has a price, so I don't have much hope for this, but we never know) and if it's not "host your own database in you basement", that would be awesome!
Tell me if you need more information, if you want me to put some other piece of code or anything.
Have a nice day/afternoon/evening/night, Benjamin
Alright so I found where the problem was from, thanks to @OctaviaSima who pointed me to the executions logs. Apparently, for some reason I don't know, sometimes, my function "getAllConnections_()" which was supposed to get all the contacts in the Google Contacts book failed to get some contacts using this code:
/** Helper function to list all connections of the current google user
*
* @returns {PeopleAPI.Person[]} connections - All of the connection objects from Google's People API
*/
function getAllConnections_() {
var connections = [];
var apiResponse;
var nextPageToken;
var firstPass = true;
do {
if (firstPass) {
apiResponse = People.People.Connections.list('people/me', {'personFields': 'memberships,emailAddresses,phoneNumbers,names,addresses,biographies,userDefined'});
firstPass = false;
}
else {
apiResponse = People.People.Connections.list('people/me', {'personFields': 'memberships,emailAddresses,phoneNumbers,names,addresses,biographies,userDefined', 'pageToken': nextPageToken});
}
connections = connections.concat(apiResponse.connections);
nextPageToken = apiResponse.nextPageToken;
} while (nextPageToken);
return connections;
}
E.g. last execution, the actual contact list was 4061 connections long, however the script only got 4056 connections, which led to 5 duplicate contacts being created. I added a quick patch by ensuring the connections table was as long as the number of contacts by calling the function recursively if it's not the case.
/** Helper function to list all connections of the current google user
*
* @returns {PeopleAPI.Person[]} connections - All of the connection objects from Google's People API
*/
function getAllConnections_() {
var connections = [];
var apiResponse;
var nextPageToken;
var firstPass = true;
do {
if (firstPass) {
apiResponse = People.People.Connections.list('people/me', {'personFields': 'memberships,emailAddresses,phoneNumbers,names,addresses,biographies,userDefined'});
firstPass = false;
}
else {
apiResponse = People.People.Connections.list('people/me', {'personFields': 'memberships,emailAddresses,phoneNumbers,names,addresses,biographies,userDefined', 'pageToken': nextPageToken});
}
connections = connections.concat(apiResponse.connections);
nextPageToken = apiResponse.nextPageToken;
} while (nextPageToken);
if (connections.length != apiResponse.totalItems) {connections = getAllConnections_();} // Hopefully, the magic lies in this line of code
return connections;
}
Posting this here in case it helps someone else.
Edit: Just corrected the test on the magic line from "==" to "!=".