Match and extract href info using regex


I am trying to write a regex in Swift that matches an anchor tag and extracts its href link information in more than one case: when the value is wrapped in double quotation marks, in single quotation marks, or not quoted at all.

The regex should match the href and extract the info from each of these:

<a href=https://www.google.com>Google</a>
<a href="https://www.google.com">Google</a>
<a href='https://www.google.com'>Google</a>

I have found this regex, but it only works with double quotation marks:

<a href="([^"]+)">([^<]+)<\/a>

Result:

Match 1: <a href="https://www.google.com">Google</a>
Group 1: https://www.google.com
Group 2: Google

What I want is to match all three forms shown in the sample text.

Note: I know that regex shouldn't be used for parsing HTML, but I am using it for a very small use case so it's fine.


Solution

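  • Assuming the anchor tags in the file you wish to parse have no attributes other than href, you can use the following regex: /<a href=('|"|)([^'">]+)\1>([^<]+)<\/a>/$2 $3/gm (the pattern, then $2 $3 as the substitution, then the g and m flags).

    Group 1 captures a single quote, a double quote, or nothing, and \1 then requires that same character (or nothing) again before the closing >. As a result, the link ends up in group 2 and the link text in group 3, rather than in groups 1 and 2 as in the original regex. You can try it live on regex101.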
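    Since the question is about Swift, here is a minimal sketch of how the pattern could be applied with NSRegularExpression. The function name extractLinks and the sample string are hypothetical, chosen only for illustration. The sketch writes the quote group as (['"]?), which is meant to be equivalent to ('|"|), and it still assumes that href is the only attribute on the tag.

```swift
import Foundation

// Hypothetical helper (not from the original answer): returns the URL and
// link text of every simple <a href=...>text</a> anchor found in `html`.
func extractLinks(from html: String) -> [(url: String, text: String)] {
    // Group 1: optional quote character, group 2: URL, group 3: link text.
    // \1 requires the same quote (or nothing) before the closing ">".
    let pattern = #"<a href=(['"]?)([^'">]+)\1>([^<]+)</a>"#
    guard let regex = try? NSRegularExpression(pattern: pattern) else {
        return []
    }

    let fullRange = NSRange(html.startIndex..., in: html)
    return regex.matches(in: html, range: fullRange).compactMap {
        match -> (url: String, text: String)? in
        // Convert the NSRange of each capture group back to a String range.
        guard let urlRange = Range(match.range(at: 2), in: html),
              let textRange = Range(match.range(at: 3), in: html) else {
            return nil
        }
        return (url: String(html[urlRange]), text: String(html[textRange]))
    }
}

// Sample input covering the three forms from the question.
let sample = """
<a href=https://www.google.com>Google</a>
<a href="https://www.google.com">Google</a>
<a href='https://www.google.com'>Google</a>
"""

for link in extractLinks(from: sample) {
    print(link.url, link.text)
}
```

    A raw string literal (#"..."#) is used so that the quotes and the \1 backreference do not need extra escaping. If the tags can carry other attributes or nested markup, this pattern will miss them, and an HTML parser is likely the safer choice.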