I have to replace all the <img> tags that contain one text (dog) but do not contain another text (cat), in a multiline text.
So having this text:
<img black
dog>
<img dog white cat>
<img black dog>
<img cat and dog>
<img red fox>
<img black dog>
The following tags should be found:
<img black
dog>
<img black dog>
<img black dog>
There are a lot of ways to do this for single-line text using ^ and $, but I have not been able to do it with multiline text.
My first attempt used the single-line option (/s) this way:
/<img ((?!cat).)*?(dog)>/gs
But it also selects the tag before the last dog (red fox), because it is not greedy enough.
Then I tried to make it greedy (adding a ?) without the /s option, using \s\S:
/<img ((?!cat)[\s\S.])*?(dog)?>/g
And the fifth tag (<img red fox>) is matched again, even though it contains no dog.
How can I get my 3 dogs selected with no cats or foxes?
Link to my attempt in regex101: https://regex101.com/r/AGgb4z/1
You could match <img, then assert that there is no cat before the closing > using a negative lookahead (?![^<>]*cat).
Use a negated character class [^<>]*, which matches any character except < and > (newlines included, so no /s option is needed), on the left and the right of dog.
You could use word boundaries, for example \bcat\b, if cat and dog should not be part of a longer word.
<img (?![^<>]*cat)[^<>]*dog[^<>]*>
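As a quick check, here is a minimal sketch that runs the pattern above against the sample text from the question. It uses Python's re module only for illustration (the question does not say which regex engine is in use), and the replacement string <img REPLACED> is a made-up placeholder. The second pattern is the word-boundary variant mentioned above.

```python
import re

# Sample text from the question; the first tag spans two lines.
TEXT = """<img black
dog>
<img dog white cat>
<img black dog>
<img cat and dog>
<img red fox>
<img black dog>"""

# Pattern from the answer: the lookahead rejects a tag that contains "cat"
# before its closing ">", and [^<>]* keeps every match inside a single tag.
PATTERN = re.compile(r"<img (?![^<>]*cat)[^<>]*dog[^<>]*>")

# Word-boundary variant: "cat" and "dog" must be whole words.
PATTERN_WORDS = re.compile(r"<img (?![^<>]*\bcat\b)[^<>]*\bdog\b[^<>]*>")

# No flags are needed: a negated character class also matches newlines.
matches = PATTERN.findall(TEXT)
print(matches)
# ['<img black\ndog>', '<img black dog>', '<img black dog>']

# Replace only the matching tags (the replacement text is a placeholder).
replaced = PATTERN.sub("<img REPLACED>", TEXT)
print(replaced)

# The two patterns differ only when "cat" or "dog" is part of a longer word.
print(bool(PATTERN.search("<img dog category>")))        # False
print(bool(PATTERN_WORDS.search("<img dog category>")))  # True
print(bool(PATTERN_WORDS.search("<img dogma>")))         # False
```

Neither the cat tags nor <img red fox> are matched, because [^<>]* cannot run past the closing > of one tag into the next one.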