I am trying to capture the following pattern:
&lt;a href="http://cdn.xyz.com/media/info.pdf" target="_blank"&gt;
This is what I am trying:
preg_match_all( '/(<[a-zA-Z]+[^>]+>)/ism', $str, $matches);
This is not capturing the above pattern.
How should I restructure the pattern?
You could use a regex based on a negative lookahead assertion.
preg_match_all('~&lt;[a-zA-Z]+(?:(?!&[lg]t;).)*&gt;~is', $str, $matches);
(?:(?!&[lg]t;).)* matches any run of characters that does not contain &lt; or &gt;. That is, before consuming each character, the lookahead checks that the character is not the & that begins &lt; or &gt;.
Or, more simply, use a lazy quantifier:
&lt;[a-zA-Z]+.*?&gt;
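
Both patterns can be tried on a sample string. This is a minimal sketch, assuming the input holds entity-encoded angle brackets (&lt; and &gt;), which is what the &[lg]t; lookahead targets. The sample text in $str and the variable names $tempered, $lazy and $lazyMatches are hypothetical, chosen for illustration only.

```php
<?php
// Hypothetical sample input: markup whose angle brackets are entity-encoded.
$str = 'See &lt;a href="http://cdn.xyz.com/media/info.pdf" target="_blank"&gt;the PDF&lt;/a&gt; here.';

// Lookahead version: after the tag name, consume one character at a time,
// stopping before the next &lt; or &gt; entity.
// Note: PHP has no "g" modifier; preg_match_all() is already global.
$tempered = '~&lt;[a-zA-Z]+(?:(?!&[lg]t;).)*&gt;~is';
preg_match_all($tempered, $str, $matches);

// Lazy version: consume as few characters as possible up to the first &gt;.
$lazy = '~&lt;[a-zA-Z]+.*?&gt;~is';
preg_match_all($lazy, $str, $lazyMatches);

print_r($matches[0]);
print_r($lazyMatches[0]);
```

On this input both patterns find only the opening tag, because the closing &lt;/a&gt; does not start with a letter after &lt;. The two differ on malformed input: the lazy version will run across a stray &lt; to reach the next &gt;, while the lookahead version refuses to cross either entity.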