How would you split after every and/ERT,
but only when the word that follows it does not contain "/V", in:
text <- c("faulty and/ERT something/VBN and/ERT else/VHGB and/ERT as/VVFIN and/ERT not else/VHGB propositions one and/ERT two/CDF and/ERT three/ABC")
# my attempt - doesn't work
> strsplit(text, "(?<=and/ERT)\\s(?!./V.)", perl=TRUE)
                                    ^^^^
# Expected return
[[1]]
[1] "faulty and/ERT something/VBN and/ERT else/VHGB and/ERT as/VVFIN and/ERT"
[2] "not else/VHGB propositions one and/ERT"
[3] "two/CDF and/ERT"
[4] "three/ABC"
Actually, you need to approach this in another way:
(?<=and/ERT)\\s(?!\\S+/V)
                  ^^^^
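You need \\S+ rather than .* here. With .* the lookahead can scan past the next word, so it blocks the split whenever /V appears anywhere later in the string, even two or more words ahead. \\S+ matches one or more non-whitespace characters, so the check stays confined to the single word right after the space.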
Lastly, the trailing . after /V in your pattern can safely be dropped: the lookahead only has to find /V, not a character after it.
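For reference, here is a minimal sketch that runs the pattern on the sample text from the question and compares it with a .* lookahead. The names parts, greedy and expected are only illustrative.

```r
text <- c("faulty and/ERT something/VBN and/ERT else/VHGB and/ERT as/VVFIN and/ERT not else/VHGB propositions one and/ERT two/CDF and/ERT three/ABC")

# Split on the whitespace after "and/ERT" unless the very next word contains "/V".
parts <- strsplit(text, "(?<=and/ERT)\\s(?!\\S+/V)", perl = TRUE)

# Same idea with .* in the lookahead: any "/V" later in the string blocks the split.
greedy <- strsplit(text, "(?<=and/ERT)\\s(?!.*/V)", perl = TRUE)

# The result the question asked for.
expected <- c(
  "faulty and/ERT something/VBN and/ERT else/VHGB and/ERT as/VVFIN and/ERT",
  "not else/VHGB propositions one and/ERT",
  "two/CDF and/ERT",
  "three/ABC"
)

print(parts)
print(identical(parts[[1]], expected))  # TRUE
print(length(greedy[[1]]))              # 3, the split before "not" is lost
```

The .* variant misses the split before "not" because else/VHGB still appears further along in the string.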