javascript regex

Regular expression isn't matching all hyperlinks in a string


I have some text:

const str = `This <a href="https://regex101.com/" data-link-id="431ebea7-1426-65a5-8383-55a27313dc51">is a test link</a> which has a hyperlink, and <a href="https://regex102.com/" data-link-id="d62dc3eb-7b3d-953e-4d7a-987448e6928d">this is also</a> a hyperlink.`

I'm trying to match all <a> tags, but my regular expression just returns the whole thing as a single match:

str.match(/<a href=".+ data-link-id="[0-9A-Z-a-z]{1,}">(.*?)<\/a>/)

What am I doing wrong here? I expect the result to be an array of two elements. Instead of (.*?), I've tried .+ and [A-Za-z0-9\s]+, same result.


Solution

  • Your regex has one bug: it uses href=".+ to match the start of the anchor tag. The .+ is greedy, so it consumes as much of the string as it can and only backtracks as far as the last data-link-id=", which means a single match runs from the first <a to the final </a>. If you use the lazy .+? instead, it stops at the first data-link-id=" it reaches, and each anchor is matched separately.

    const str = 'This <a href="https://regex101.com/" data-link-id="431ebea7-1426-65a5-8383-55a27313dc51">is a test link</a> which has a hyperlink, and <a href="https://regex102.com/" data-link-id="d62dc3eb-7b3d-953e-4d7a-987448e6928d">this is also</a> a hyperlink.';
    // Lazy .+? stops at the first data-link-id=" after each href; /g collects every match.
    const matches = str.match(/<a href=".+? data-link-id="[0-9A-Z-a-z]{1,}">(.*?)<\/a>/g);
    console.log(matches); // two elements, one full <a ...>...</a> string per link

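    Note also that match() needs the global /g flag to return all matches. Without it, you get only the first match together with its capture groups; with it, you get the full matched strings and the capture groups are dropped.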
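
    If you also need the captured pieces (the URL, the id, the link text) and not just the full matched strings, String.prototype.matchAll() may be a better fit than match(). The following is only a sketch under a few assumptions: the attributes always appear in the order href, data-link-id, as in the sample string, and the names anchorPattern and links are made up for illustration. It also swaps the lazy .+? for a negated character class, which is a different technique from the fix above but avoids the same over-matching.

```javascript
// Same sample string as in the question.
const str = 'This <a href="https://regex101.com/" data-link-id="431ebea7-1426-65a5-8383-55a27313dc51">is a test link</a> which has a hyperlink, and <a href="https://regex102.com/" data-link-id="d62dc3eb-7b3d-953e-4d7a-987448e6928d">this is also</a> a hyperlink.';

// [^"]+ cannot run past the closing quote of the attribute,
// so it cannot swallow the following anchors the way a greedy .+ does.
const anchorPattern = /<a href="([^"]+)" data-link-id="([0-9A-Za-z-]+)">(.*?)<\/a>/g;

// matchAll() requires the /g flag and yields one match object per anchor,
// each with its capture groups intact.
const links = Array.from(
  str.matchAll(anchorPattern),
  ([, href, id, text]) => ({ href, id, text })
);

console.log(links);
```

    Each entry in links is a plain object with href, id and text properties. For anything beyond a fixed, known markup shape like this one, an HTML parser is usually more robust than a regular expression.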