How do I capture the word "entity" only when it is not followed by a hyphen, an underscore, or an alphanumeric character, while leaving whatever follows it out of the match?
For example, I want to capture the word "entity" in the following situations:
entity
entity,
[entity]
But I do NOT want it to capture the word in the following situations:
entity-foo
entity_bar
entityfoobar
entity0foo
The furthest I got is:
(entity)[^-\$a-zA-Z_0-9]
However, the above regex matches "entity," without ignoring the "," and "entity]" without ignoring the "]".
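To see why the attempt above picks up the trailing character, here is a minimal Python sketch. It assumes Python's re module treats this pattern the same way the Sublime Text regex engine does, which should hold for these basic constructs but is not guaranteed in general. The helper name describe is made up for illustration.

```python
import re

# The pattern from the attempt above: "entity" followed by ONE character
# that is not "-", "$", "_", a letter, or a digit.
ATTEMPT = re.compile(r"(entity)[^-\$a-zA-Z_0-9]")

def describe(text):
    """Return (whole match, group 1) for the first match, or None."""
    m = ATTEMPT.search(text)
    return (m.group(0), m.group(1)) if m else None

for sample in ["entity,", "[entity]", "entity", "entity-foo"]:
    print(repr(sample), "->", describe(sample))
```

The negated character class has to consume a character, so the "," or "]" becomes part of the overall match even though group 1 holds only "entity". For the same reason, a bare "entity" at the very end of the input does not match at all.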
I'm trying to capture this token in a Sublime Syntax definition.
Sounds like a job for lookaheads!
Something like this should work:
(?<=\[)?(entity)(?=[\s,\]])
(?<=\[)? : The (?<=regex) construct is a lookbehind. We make it optional by using a trailing ?. This lookbehind looks for a [ character in front of our regex.
(entity) : Matching the phrase entity and capturing it.
(?=[\s,\]]) : A lookahead ((?=regex)), looking for any of \s, a comma, and ]. \s in RegEx matches a whitespace character, which includes spaces, tabs, newlines, etc.

One caveat of my pattern is that the phrase entity] will be matched without the leading [, which isn't specified in your examples. This can potentially be expanded further, but it will begin to get messy, and may not be necessary anyway.
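As a rough check, the lookahead can be run against the examples from the question. The sketch below uses Python's re module, on the assumption that it behaves like the Sublime Text engine for these constructs. The optional lookbehind from the breakdown is left out, since an optional assertion does not restrict what matches. NEGATIVE is a hypothetical variant, not part of the pattern above: it swaps in a negative lookahead built from the character class in the question.

```python
import re

# The lookahead from the answer: "entity" must be followed by
# whitespace, "," or "]". The lookahead consumes nothing.
POSITIVE = re.compile(r"(entity)(?=[\s,\]])")

# Hypothetical variant (an assumption, not from the answer): reject "entity"
# when the next character is "-", "$", "_", a letter, or a digit.
NEGATIVE = re.compile(r"(entity)(?![-$a-zA-Z_0-9])")

def spans(pattern, text):
    """Return the full matched text (group 0) of every match in text."""
    return [m.group(0) for m in pattern.finditer(text)]

samples = ["entity", "entity,", "[entity]",
           "entity-foo", "entity_bar", "entityfoobar", "entity0foo"]

for s in samples:
    print(f"{s!r:16} positive={spans(POSITIVE, s)} negative={spans(NEGATIVE, s)}")
```

Both patterns match only "entity" itself in "entity," and "[entity]", and both skip the four unwanted cases. They differ on a bare "entity" at the very end of the input: the positive lookahead needs a following character and finds none, while the negative lookahead succeeds because nothing forbidden follows.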