I'm trying to run a complex string search in a document using a JavaScript web add-in for Word. It mostly works: I'm searching for strings that begin with "XYZ ", followed by 1 to 10 alphanumeric characters, a period, and another 1 to 10 alphanumeric characters. The search call:
body.search('XYZ [0-9a-zA-Z]{1,10}.[0-9a-zA-Z]{1,10}', { matchWildcards: true });
...finds most of them but misses some, apparently because the hard-coded blank doesn't match the whitespace in those occurrences. If I search instead for a string like:
body.search('XYZ^w^#^#^#.^#^#^#', { matchWildcards: false });
...using search notation for special characters (specifically ^w for whitespace) then that will catch all the whitespaces, but is too specific for a practical search.
Whenever I try to combine search notation with wildcards, or even if I just set matchWildcards to true in the above, I get a general exception. Is there no way to combine these terms or to otherwise designate a whitespace with wildcards enabled without hard coding the white space?
NOTE: I've looked carefully at the actual characters, expecting some Unicode difference to be the culprit. I've even opened up the document and parsed the XML. I can't find a difference in the characters themselves, although there is some difference in the XML.
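For anyone checking the same thing, one way to compare characters that look identical is to print the code point of each one. This is only a minimal sketch: `describeChars` is a hypothetical helper name, and the sample string is made up for illustration rather than taken from my document.

```javascript
// Hypothetical helper: pair each character of a string with its decimal code point.
function describeChars(text) {
  return Array.from(text).map(function (ch) {
    return { char: ch, code: ch.codePointAt(0) };
  });
}

// Made-up sample with a non-breaking space (U+00A0) where a blank appears to be.
var sample = "XYZ\u00A0AB1.CD2";
var codes = describeChars(sample).map(function (c) { return c.code; });

// A regular space shows up as 32; anything else in that position is suspect.
console.log(codes.join(" "));
```

In an add-in, the text to inspect could come from a range's `text` property after loading it and calling `context.sync()`.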
Figured it out, although I'd still like an explanation for why you can't use special character search notation with matchWildcards set to true.
In my case, as it turned out, some whitespaces were being missed because they were non-breaking spaces (character code 160) instead of regular spaces (character code 32). I solved it by accepting either one as the fourth character of the search string, thusly:
var sSearchString = "XYZ[" + String.fromCharCode(32, 160) + "][0-9a-zA-Z]{1,10}.[0-9a-zA-Z]{1,10}";
searchResults = body.search(sSearchString, { matchWildcards: true });
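For completeness, here is roughly how that could sit inside a `Word.run` call. Treat it as a sketch under assumptions rather than my exact code: `buildXyzPattern` and `findXyzCodes` are names I made up for illustration, and the second function only works where Office.js is loaded.

```javascript
// Hypothetical helper: builds the wildcard pattern with a character class that
// accepts either a regular space (32) or a non-breaking space (160) after "XYZ".
function buildXyzPattern() {
  var spaces = String.fromCharCode(32, 160);
  return "XYZ[" + spaces + "][0-9a-zA-Z]{1,10}.[0-9a-zA-Z]{1,10}";
}

// Hypothetical wrapper: runs the wildcard search over the document body and
// resolves with the text of each match. Requires the Word JavaScript API.
function findXyzCodes() {
  return Word.run(function (context) {
    var results = context.document.body.search(buildXyzPattern(), { matchWildcards: true });
    results.load("text");
    return context.sync().then(function () {
      return results.items.map(function (range) { return range.text; });
    });
  });
}
```

Keeping the pattern in its own function means the string can be checked outside of Word, for example to confirm that both space characters really ended up inside the brackets.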