php regex text-extraction

Get href value between two substrings


I am trying to capture the following pattern:

&lt;a href="http://cdn.xyz.com/media/info.pdf" target="_blank"&gt;

This is what I am trying:

preg_match_all( '/(<[a-zA-Z]+[^>]+>)/ism', $str, $matches);

This is not capturing the above pattern.

How should I restructure the pattern?


Solution

  • You could use a regex based on a negative lookahead assertion.

    preg_match_all('~&lt;[a-zA-Z]+(?:(?!&[lg]t;).)*&gt;~is', $str, $matches);
    

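    (?:(?!&[lg]t;).)* matches any run of characters that does not contain &lt; or &gt;. Before consuming each character, the negative lookahead checks that the character is not the & that begins &lt; or &gt;. Note that PHP has no g modifier; preg_match_all already finds every match.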

    Or use a lazy quantifier, which stops at the first closing &gt;:

    &lt;[a-zA-Z]+.*?&gt;
    

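As a rough sketch of how the lookahead-based pattern could be used to pull out the href value, the snippet below assumes the input is HTML-entity-encoded (it contains the literal text &lt; and &gt;) and that attribute quotes are plain double quotes. The function names findEncodedTags and findEncodedHrefs and the sample string are made up for illustration; they are not part of the original question.

```php
<?php
// Hypothetical helper: returns every entity-encoded opening tag in $str.
function findEncodedTags(string $str): array
{
    // After "&lt;" and the tag name, consume one character at a time,
    // but only while that position does not start "&lt;" or "&gt;".
    preg_match_all('~&lt;[a-zA-Z]+(?:(?!&[lg]t;).)*&gt;~is', $str, $m);
    return $m[0];
}

// Hypothetical helper: returns the href value of every encoded <a> tag.
function findEncodedHrefs(string $str): array
{
    // Same tempered dot on both sides of the attribute, with a capture
    // group around the text between the double quotes.
    $pattern = '~&lt;a\b(?:(?!&[lg]t;).)*?\bhref="([^"]*)"(?:(?!&[lg]t;).)*&gt;~is';
    preg_match_all($pattern, $str, $m);
    return $m[1];
}

// Assumed sample input, modelled on the tag from the question.
$str = 'Download: &lt;a href="http://cdn.xyz.com/media/info.pdf" target="_blank"&gt;info&lt;/a&gt;';

print_r(findEncodedTags($str));  // one element: the encoded opening <a> tag
print_r(findEncodedHrefs($str)); // one element: http://cdn.xyz.com/media/info.pdf
```

If decoding the string first is an option, calling html_entity_decode($str) and then using an HTML parser is usually less fragile than matching tags with a regex.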