Say for example I have this text:
hello world **ant*** lorem **cat** opposum** *** ***antelope*** *rabbit __dog__
I would like to match strings that have only ** or __ as their preceding and concluding characters. So in the case above, the only matches I want are "cat" and "dog". This means I have to reject the match if there are extra surrounding characters. For example, ***dog** or __dog___ should fail.
I've tried to solve this using a negative lookaround (http://www.regular-expressions.info/lookaround.html), to no avail.
Here's the current pattern I have:
const pattern = /([^*])\*(\w+)\*([^*])/g;
const match = pattern.exec(text);
const annotatedText = match[0];
const matchedText = match[1];
// Return if annotatedText is a possible match for bolditalic
if (annotatedText.startsWith("***") || annotatedText.startsWith("___")) {
  return;
}
// Return if the matchedText has spaces in between
if (/\s/.test(matchedText)) {
  return;
}
if (text.match(/^([*_ \n]+)$/g)) {
  return;
}
Essentially, I want to remove the JavaScript string checks and move that logic into the regex pattern itself.
Use
/(?<=(?<!\*)\*\*)\w+(?=\*\*(?!\*))|(?<=(?<!_)__)\w+(?=__(?!_))/gi
See proof.
Explanation
--------------------------------------------------------------------------------
(?<= look behind to see if there is:
--------------------------------------------------------------------------------
(?<! look behind to see if there is not:
--------------------------------------------------------------------------------
\* '*'
--------------------------------------------------------------------------------
) end of look-behind
--------------------------------------------------------------------------------
\* '*'
--------------------------------------------------------------------------------
\* '*'
--------------------------------------------------------------------------------
) end of look-behind
--------------------------------------------------------------------------------
\w+ word characters (a-z, A-Z, 0-9, _), 1 or more times (matching as many as possible)
--------------------------------------------------------------------------------
(?= look ahead to see if there is:
--------------------------------------------------------------------------------
\* '*'
--------------------------------------------------------------------------------
\* '*'
--------------------------------------------------------------------------------
(?! look ahead to see if there is not:
--------------------------------------------------------------------------------
\* '*'
--------------------------------------------------------------------------------
) end of look-ahead
--------------------------------------------------------------------------------
) end of look-ahead
--------------------------------------------------------------------------------
| OR
--------------------------------------------------------------------------------
(?<= look behind to see if there is:
--------------------------------------------------------------------------------
(?<! look behind to see if there is not:
--------------------------------------------------------------------------------
_ '_'
--------------------------------------------------------------------------------
) end of look-behind
--------------------------------------------------------------------------------
__ '__'
--------------------------------------------------------------------------------
) end of look-behind
--------------------------------------------------------------------------------
\w+ word characters (a-z, A-Z, 0-9, _), 1 or more times (matching as many as possible)
--------------------------------------------------------------------------------
(?= look ahead to see if there is:
--------------------------------------------------------------------------------
__ '__'
--------------------------------------------------------------------------------
(?! look ahead to see if there is not:
--------------------------------------------------------------------------------
_ '_'
--------------------------------------------------------------------------------
) end of look-ahead
--------------------------------------------------------------------------------
) end of look-ahead
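Both branches in the breakdown above have the same shape, differing only in the delimiter character. As a rough sketch, the pattern could be assembled from a single helper; the name doubledDelimiterBranch and the variable boldPattern are hypothetical and not part of the answer's pattern, and lookbehind support in the JavaScript engine is assumed.

```javascript
// Hypothetical helper: builds one branch of the pattern for a doubled delimiter.
function doubledDelimiterBranch(delimiter) {
  // Escape the delimiter so characters such as * are treated literally.
  const d = delimiter.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  // Exactly two delimiters before, word characters, exactly two delimiters after.
  return `(?<=(?<!${d})${d}${d})\\w+(?=${d}${d}(?!${d}))`;
}

// Join the * branch and the _ branch with | (OR).
const boldPattern = new RegExp(
  ['*', '_'].map(doubledDelimiterBranch).join('|'),
  'g'
);

const sample = 'hello world **ant*** lorem **cat** opposum** *** ***antelope*** *rabbit __dog__';
console.log(sample.match(boldPattern)); // [ 'cat', 'dog' ]
```

Building the pattern this way keeps the two branches in sync if another delimiter is added later.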
JavaScript code:
const string = 'hello world **ant*** lorem **cat** opposum** *** ***antelope*** *rabbit __dog__';
console.log(string.match(/(?<=(?<!\*)\*\*)\w+(?=\*\*(?!\*))|(?<=(?<!_)__)\w+(?=__(?!_))/gi));
// Output: [ 'cat', 'dog' ]
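One caveat worth checking: \w also matches the underscore, so in the __ branch the \w+ can absorb a stray delimiter, and inputs such as __dog___ still produce a match. A possible variant, offered only as a sketch, swaps \w for [^\W_] (a word character that is not an underscore) in that branch. The names original and stricter are illustrative.

```javascript
// The pattern as given above (the i flag is omitted; \w already covers both cases).
const original = /(?<=(?<!\*)\*\*)\w+(?=\*\*(?!\*))|(?<=(?<!_)__)\w+(?=__(?!_))/g;

// Assumed variant: [^\W_]+ in the underscore branch cannot consume underscores.
const stricter = /(?<=(?<!\*)\*\*)\w+(?=\*\*(?!\*))|(?<=(?<!_)__)[^\W_]+(?=__(?!_))/g;

console.log('__dog___'.match(original)); // [ 'dog_' ]
console.log('___dog__'.match(original)); // [ '_dog' ]
console.log('__dog___'.match(stricter)); // null
console.log('___dog__'.match(stricter)); // null

const text = 'hello world **ant*** lorem **cat** opposum** *** ***antelope*** *rabbit __dog__';
console.log(text.match(stricter)); // [ 'cat', 'dog' ]
```

The trade-off is that the stricter variant no longer matches words containing an underscore between __ delimiters, such as __snake_case__.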