I'm trying to use a jq v1.6 filter with a regex that contains negative look-behind and negative look-ahead expressions, but it fails with "Regex failure: invalid pattern in look-behind", even though, going by the spec, it looks like a valid expression.
The command I'm using is:
$ jq -n '("baz", "foo baz", "bla baz", "baz bars") | test("(?<!foo |bars )baz(?! foo| bars)")'
jq: error (at <unknown>): Regex failure: invalid pattern in look-behind
It seems like jq 1.6 is using the Oniguruma library version 5.9.6 (https://github.com/stedolan/jq/commit/61edf3fa93f6177ef099b1b0cb2b49813a35c546#diff-ea6712465e6d2ae84a07da73f4ad6e25; this seems to be the right script version, because jq 1.6 was released in Nov 2018 and the next commit to compile-ios.sh was not until Dec 2019).
The closest documentation to Oniguruma 5.9.6 that I could find is from 5.9.1 (in https://github.com/kkos/oniguruma/commit/65a9b1aa03c9bc2dc01b074295b9603232cb3b78# you can search for "negative look-behind", line 221 of the doc/RE file):
(?<!subexp) negative look-behind
Subexp of look-behind must be fixed character length. But different character length is allowed in top level alternatives only. ex. (?<=a|bc) is OK. (?<=aaa(?:b|cd)) is not allowed.
In negative-look-behind, captured group isn't allowed, but shy group(?:) is allowed.
So it seems like my expression should work.
After testing a few things, I found out that this works:
jq -n '("baz", "foo baz", "bla baz", "baz bars") | test("(?<!foo|bar)baz(?! foo| bars)")'
The only difference is that the look-behind alternatives now all have the same width (three characters each), but the docs clearly state that top-level alternatives are allowed to have different widths.
It seems that, for some reason, this particular version of jq does not support variable-width alternatives in a (negative) look-behind expression, even though the spec mentions no such restriction.
I suspect something is going on with the particular jq build I installed, because if I try to run the regex example from https://stedolan.github.io/jq/manual/#RegularexpressionsPCRE I also get an error:
$ jq -n '("test", "TEst", "teST", "TEST") | test( "(?i)te(?-i)st" )'
jq: error (at <unknown>): Regex failure: invalid group name <>
Does anyone have any idea what could be wrong?
If your current library version is limited to fixed-width lookbehind patterns, you can't do much about it.
In your case, since you are using negative lookbehinds, you can do without the alternation by splitting the lookbehind into two:
(?<!foo )(?<!bars )baz(?! foo| bars)
^^^^^^^^^^^^^^^^^^^
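Both lookbehinds are checked at the same position, so baz only matches when neither "foo " nor "bars " precedes it, which is what the alternation was meant to express. Each lookbehind is then fixed-width on its own, so you do not have to care how many chars each one has to check.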
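If you want to sanity-check the split pattern outside jq, here is a minimal sketch in Python. It assumes Python's re module as a stand-in: it is a different engine from Oniguruma, but it also rejects look-behinds whose alternatives differ in width, so it only illustrates the idea and says nothing about your jq build. The variable names are mine.

```python
import re

inputs = ["baz", "foo baz", "bla baz", "baz bars"]

# Pattern from the question: one look-behind with alternatives of widths 4 and 5.
original = r"(?<!foo |bars )baz(?! foo| bars)"
# Split form: one fixed-width negative look-behind per alternative.
split = r"(?<!foo )(?<!bars )baz(?! foo| bars)"

try:
    re.compile(original)
    original_ok = True
except re.error:
    # Python's re refuses the mixed-width look-behind at compile time.
    original_ok = False

# Rough equivalent of jq's test(): True if the pattern matches anywhere in the string.
results = [bool(re.search(split, s)) for s in inputs]

print(original_ok, results)
# prints: False [True, False, True, False]
```

With the split pattern, "baz" and "bla baz" match, while "foo baz" is blocked by the first lookbehind and "baz bars" by the lookahead.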