I'm creating a Gmail Add-on that sends certain information about emails to a backend endpoint. To do that, it must select and format the email addresses in the following way:
If GmailMessage.getFrom() or GmailMessage.getTo() is a string in the format of a plain email address, that string is the address and no formatting should be performed.
If GmailMessage.getFrom() or GmailMessage.getTo() is a string in a format like John Doe <john@doe.com>, then the address should be the substring between the angle brackets.
So I wrote the following code for that:
for (var i = 0; i < messages.length; i++) {
var address = '';
var name = '';
var from = messages[i].getFrom()
var to = messages[i].getTo();
Logger.log(from);
Logger.log(to);
if (messages[i].isInInbox()) {
Logger.log('inbox'); // (*)
if (/<(.*?)>/g.test(from)) {
Logger.log(/<(.*?)>/g.test(from));
Logger.log('true');
address = /<(.*?)>/.exec(from)[1];
} else {
Logger.log(/<(.*?)>/g.test(from)); // (**)
Logger.log('false');
address = /^(([^<>()\[\]\\.,;:\s@"]+(\.[^<>()\[\]\\.,;:\s@"]+)*)|(".+"))@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}])|(([a-zA-Z\-0-9]+\.)+[a-zA-Z]{2,}))$/g.exec(from)[0];
}
name = /^(.*?)@/.exec(address);
} else {
Logger.log('sent');
if (/<(.*?)>/g.test(to)) {
Logger.log(/<(.*?)>/g.test(to));
Logger.log('true');
address = /<(.*?)>/.exec(to)[1];
} else {
Logger.log(/<(.*?)>/g.test(to));
Logger.log('false');
address = /^(([^<>()\[\]\\.,;:\s@"]+(\.[^<>()\[\]\\.,;:\s@"]+)*)|(".+"))@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}])|(([a-zA-Z\-0-9]+\.)+[a-zA-Z]{2,}))$/g.exec(to)[0];
}
name = /^(.*?)@/.exec(address);
}
}
Where messages is an array of GmailMessage objects.
The problem is that for one of the Inbox messages, the 'from' address tests true against the /<(.*?)>/g regular expression, yet execution enters the else branch anyway.
To put it more clearly, the Logger output is the following:
[19-10-14 13:52:14:750 PDT] "Foo Bar" <foo@bar.com>
[19-10-14 13:52:14:750 PDT] john@doe.com <---- selected address
[19-10-14 13:52:14:751 PDT] sent
[19-10-14 13:52:14:751 PDT] false <---- it's a "simple string", not enclosed by <>
[19-10-14 13:52:14:752 PDT] false
[19-10-14 13:52:14:753 PDT] John Doe <john@doe.com> <---- selected address
[19-10-14 13:52:14:754 PDT] "Foo Bar" <foo@bar.com>
[19-10-14 13:52:14:755 PDT] inbox
[19-10-14 13:52:14:755 PDT] true <---- enclosed by <>
[19-10-14 13:52:14:755 PDT] true
[19-10-14 13:52:14:757 PDT] "Foo Bar" <foo@bar.com>
[19-10-14 13:52:14:757 PDT] John Doe <john@doe.com> <---- selected address
[19-10-14 13:52:14:757 PDT] sent
[19-10-14 13:52:14:758 PDT] true <---- enclosed by <>
[19-10-14 13:52:14:758 PDT] true
[19-10-14 13:52:14:760 PDT] John Doe <john@doe.com> <---- selected address
[19-10-14 13:52:14:760 PDT] "Foo Bar" <foo@bar.com>
[19-10-14 13:52:14:761 PDT] inbox
[19-10-14 13:52:14:761 PDT] true <---- enclosed by <>
[19-10-14 13:52:14:761 PDT] false <---- it's entering to the else statement nonetheless
Any clue why this strange behaviour is happening? I tried really hard to figure out what is wrong with this code, but I really don't know what it could be.
Edit: I ran some more tests and, strangely, the expression /<(.*?)>/g.exec(from) is equal to null if I place it below Logger.log('inbox') (*) and equal to [<john@doe.com>, john@doe.com] inside the else statement (**). Could anyone explain this behaviour?
It's the global modifier that is causing the problem. Regexes with the global modifier are stateful between calls, including calls to test(): each regex object keeps a lastIndex property marking where the next search starts. That state is maintained so that things like exec() can be called iteratively to return the next match. Since you're using literals for the expressions, the interpreter just optimizes each one to a single instance, which is then reused on every loop iteration. Same instance + stateful + multiple calls to test() with the same argument == what you're seeing.
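The statefulness described above can be reproduced in isolation. This is a minimal sketch, assuming a single regex object held in a variable (the names re, noG and sample are illustrative, not from the question) to stand in for the one instance that gets reused between calls:

```javascript
// One shared regex object with the g flag (hypothetical name: re).
var re = /<(.*?)>/g;
var sample = '"Foo Bar" <foo@bar.com>';

// First call: a match is found and lastIndex moves to just past the '>'.
var first = re.test(sample);
var indexAfterFirst = re.lastIndex;

// Second call on the same string: the search resumes at lastIndex,
// finds nothing, returns false and resets lastIndex to 0.
var second = re.test(sample);
var indexAfterSecond = re.lastIndex;

// Same pattern without the g flag: every call starts from index 0.
var noG = /<(.*?)>/;
var third = noG.test(sample);
var fourth = noG.test(sample);
```

Here the same input yields a different answer on the second call only for the g-flagged object, which matches the alternating pattern in the log output.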
Remove the g flag in those expressions and it'll be fine. It isn't even needed for the way you're using them, since you're only testing for any match.
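As a rough illustration of that fix, the bracket check and the extraction can be folded into one call without the g flag. This is only a sketch: extractAddress is a hypothetical helper name, and it simplifies the original code by returning the raw string when there are no angle brackets instead of running the long validation regex.

```javascript
// Hypothetical helper: returns the bare address from a From/To header value.
function extractAddress(header) {
  // No g flag, so no lastIndex state is carried between calls.
  var match = /<(.*?)>/.exec(header);
  // Simplifying assumption: a header without angle brackets is already
  // a plain address and is returned unchanged.
  return match ? match[1] : header;
}

var results = [
  extractAddress('"Foo Bar" <foo@bar.com>'),
  extractAddress('"Foo Bar" <foo@bar.com>'), // repeated input, same result
  extractAddress('John Doe <john@doe.com>'),
  extractAddress('john@doe.com')
];
```

Because exec() returns null when nothing matches, a single call covers both the test and the capture, so there is no second regex whose state could disagree with the first.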