I am writing a script for a Google Doc that counts words and highlights them. The script works, but not quite as it should: matches inside other words should not be counted or highlighted. For example, if I am looking for the word cop and the text contains robocop, that word should be skipped.
I tried a regular expression with the word "me", but it doesn't seem to fit, as I need to go through the text, highlighting words along the way. But maybe I just don’t understand how to do it right.
function findWords2(keys) {
  var body = doc.getBody();
  var keysMap = {}; // object for keys with quantity
  // For every word in keys:
  for (var w = 0; w < keys.length; ++w) {
    // Get the current word:
    //var rx = /(.){1}me(.){1}/;
    //var foundElement = rx.exec(doc.getBody().getText());
    //var foundElement = body.findText(rx);
    var foundElement = body.findText(keys[w]);
    var count = 0;
    while (foundElement != null) {
      // Get the text object from the element
      var foundText = foundElement.getElement().asText();
      count++;
      // Where in the Element is the found text?
      var start = foundElement.getStartOffset();
      var end = foundElement.getEndOffsetInclusive();
      // Change the background color to yellow
      foundText.setBackgroundColor(start, end, "#FCFC00");
      // Find the next match
      foundElement = body.findText(keys[w], foundElement);
    }
    keysMap[keys[w]] = count; // add current searched keyword to keysMap with quantity
  }
  return JSON.stringify(keysMap, null, 1);
}
So if we call findWords('cop') on the text "Robocop cop cop", cop is found and highlighted 3 times instead of 2. In theory, I just need to check the characters before and after each match, but how do I do that?
You should use the word boundary \b:
\bcop\b
Note that body.findText() receives the regex as a string, so you should escape the \:
body.findText("\\bcop\\b")
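Since the keywords in the question come from an array rather than a literal, the pattern string has to be built at runtime. A minimal sketch of how that might look; wholeWordPattern is a hypothetical helper name, not part of the Apps Script API, and it assumes each keyword starts and ends with a word character (otherwise \b will not behave as expected):

```javascript
// Hypothetical helper: builds a whole-word pattern string for body.findText().
function wholeWordPattern(word) {
  // Escape regex metacharacters so the keyword is matched literally.
  var escaped = word.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  // '\\b' in a string literal yields the two characters \b in the pattern.
  return '\\b' + escaped + '\\b';
}
```

In the question's loop, both calls would then presumably become body.findText(wholeWordPattern(keys[w])) and body.findText(wholeWordPattern(keys[w]), foundElement).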
If you're searching a plain string (using regexp.exec), use a regex literal with a single backslash:
/\bcop\b/g
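For the plain-string case, counting could be sketched roughly as follows. countWholeWord is a hypothetical name used only for illustration; the match is case-sensitive, so "Cop" would not count as "cop":

```javascript
// Hypothetical helper: counts whole-word occurrences of `word` in `text`.
function countWholeWord(text, word) {
  // An empty keyword would produce zero-length matches; treat it as no match.
  if (!word) return 0;
  // Escape regex metacharacters so the keyword is matched literally.
  var escaped = word.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  var rx = new RegExp('\\b' + escaped + '\\b', 'g');
  var count = 0;
  // With the g flag, each exec() call resumes after the previous match.
  while (rx.exec(text) !== null) {
    count++;
  }
  return count;
}

// countWholeWord('Robocop cop cop', 'cop') returns 2
```

This only counts; highlighting in the document still needs body.findText(), because exec() works on the extracted text and does not return document elements.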