Tags: php, regex, preg-match, href

preg_match a <link> href


I'm trying to do something I thought would be simple, but I'm having no luck. The goal is to grab the href value from any <link> tag. Example:

Source Material:

<link href="http://www.somesite.com/test.css" rel="stylesheet" type="text/css">

Regex I'm attempting:

<link[^>]*href=["{1}](.*?)["{1}][^>]*>

It seems valid at http://regexpal.com/, but when I try it at http://www.solmetra.com/scripts/regex/index.php it doesn't work.

Any ideas?


Solution

  • It looks like you put the {1} inside a character class [], where it is treated as the literal characters {, 1 and } rather than as a quantifier; a quantifier has to follow the class. It isn't even necessary here, since matching exactly once is the default. Beyond that, you should use [^"]* to match everything up to the next quote:

    <link[^>]*href="([^"]*)"[^>]*>
    

    Note: You're only attempting to match double-quoted href attributes. This will require modification if you expect to encounter any single-quoted attributes.

    Obligatory public service announcement: It is better to use a proper HTML parsing library to parse HTML and retrieve attributes than to try parsing it with regular expressions.
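
For what it's worth, here is a minimal sketch of how both suggestions might look in PHP. It is an illustration under some assumptions, not a definitive implementation: the second `<link>` in the sample input, the `~` delimiter, the `i` flag, the either-quote variant of the pattern, and the helper name `linkHrefs` are all mine and not part of the original question or answer. The parser-based part also assumes the DOM extension is enabled.

```php
<?php
// Sample input: the tag from the question plus a single-quoted one
// (the second tag is an assumption added for illustration).
$html = '<link href="http://www.somesite.com/test.css" rel="stylesheet" type="text/css">'
      . "\n<link rel='stylesheet' href='/local.css'>";

// 1) The pattern from the answer (double-quoted href only).
//    "~" is used as the delimiter so "/" needs no escaping;
//    the "i" flag is an addition so that <LINK HREF=...> would also match.
$doubleQuoted = '~<link[^>]*href="([^"]*)"[^>]*>~i';
$first = preg_match($doubleQuoted, $html, $m) === 1 ? $m[1] : null;

// 2) A variant that accepts either quote style: group 1 captures the
//    opening quote, \1 requires the same quote to close, group 2 is the value.
$eitherQuote = '~<link[^>]*\bhref\s*=\s*(["\'])(.*?)\1[^>]*>~i';
preg_match_all($eitherQuote, $html, $matches);
$viaRegex = $matches[2];

// 3) Parser-based alternative using PHP's DOM extension.
function linkHrefs(string $html): array
{
    $doc = new DOMDocument();
    // Collect libxml warnings internally instead of emitting them
    // when the markup is imperfect, then restore the previous setting.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $hrefs = [];
    foreach ($doc->getElementsByTagName('link') as $link) {
        if ($link->hasAttribute('href')) {
            $hrefs[] = $link->getAttribute('href');
        }
    }
    return $hrefs;
}

$viaDom = linkHrefs($html);
```

On this sample, `$first` holds the URL from the question, and `$viaRegex` and `$viaDom` should both list the two href values in document order. The regex variants remain fragile against unquoted attributes or a `>` inside an attribute value, which is where the parser-based version tends to hold up better.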