Given this vector:
ba <- c('baa','aba','abba','abbba','aaba','aabba')
I want to change the final a of each word to i, except for baa and aba.
I wrote the following line:
gsub('(?<=a[ab]b{1,2})a','i',ba,perl=T)
but was told: PCRE pattern compilation error 'lookbehind assertion is not fixed length' at ')a'.
I looked around a bit, and apparently R's Perl-style regex supports variable-width lookahead but not variable-width lookbehind. Is there a workaround for this? Thanks!
You can use \K as an alternative to the lookbehind. This escape sequence resets the starting point of the reported match, so any previously consumed characters are no longer included in it.
Quoting rexegg:
The key difference between \K and a lookbehind is that in PCRE, a lookbehind does not allow you to use quantifiers: the length of what you look for must be fixed. On the other hand, \K can be dropped anywhere in a pattern, so you are free to have any quantifiers you like before \K.
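To see what \K does to the reported match, here is a minimal sketch using one word from the question. The variable names full_match and after_k are only for illustration.

```r
x <- "aabba"

# Without \K, the whole pattern is reported as the match.
full_match <- regmatches(x, regexpr("a[ab]b{1,2}a", x, perl = TRUE))

# With \K, everything consumed before it is dropped from the reported match,
# so only the trailing "a" remains.
after_k <- regmatches(x, regexpr("a[ab]b{1,2}\\Ka", x, perl = TRUE))

print(full_match)
print(after_k)
```

Because only the part after \K counts as the match, sub() replaces just that part and leaves the prefix untouched.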
Using it in context:
# \K drops the a[ab]b{1,2} prefix from the match, so only the trailing a is replaced
sub('a[ab]b{1,2}\\Ka', 'i', ba, perl=TRUE)
# [1] "baa" "aba" "abbi" "abbbi" "aabi" "aabbi"
Or avoid lookarounds altogether with a capture group and a backreference:
sub('(a[ab]b{1,2})a', '\\1i', ba)
# [1] "baa" "aba" "abbi" "abbbi" "aabi" "aabbi"
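Note that neither pattern above is anchored, so they replace the first a that follows the prefix, which happens to be the final letter in every word of ba. If the replacement must only ever hit the last character, one possible variant (an assumption about the intent, not something the question requires) is to add $ to the pattern. The word 'abbab' below is a made-up example to show the difference.

```r
ba <- c('baa', 'aba', 'abba', 'abbba', 'aaba', 'aabba')

# Same capture-group approach, but the a must be the last character.
anchored <- sub('(a[ab]b{1,2})a$', '\\1i', ba)

# Hypothetical word where the a after the prefix is not the final letter.
extra <- 'abbab'
unanchored_extra <- sub('(a[ab]b{1,2})a', '\\1i', extra)   # changes the inner a
anchored_extra   <- sub('(a[ab]b{1,2})a$', '\\1i', extra)  # leaves the word alone

print(anchored)
print(unanchored_extra)
print(anchored_extra)
```

For the vector in the question, the anchored and unanchored patterns give the same result.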