I'm trying to use a RegEx to find a pattern within a pattern. Specifically, I want to capture a URL into a group, then search within that URL for everything that comes after the last = sign and capture that as well.
So given this string
<a href="http://my.domain.com/?s_cid=EM&s_ev9=CMC21892&s_ev10=EM_CMC21892_LC_stuff" style="color: #365EBF:">stuff</a>
I would initially find
href="http://my.domain.com/?s_cid=EM&s_ev9=CMC21892&s_ev10=EM_CMC21892_LC_stuff"
Using this RegEx: href="(https?[^"]*)"
From there I could pull the string I'm actually looking for, EM_CMC21892_LC_stuff, out of the captured group
with this: =[^"=]*$
However, I have had no success combining the two into a single RegEx.
Any thoughts?
He's right: using regexes to parse HTML is just asking for trouble.
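That said, try this: href="http[^"]+=([^"]+?)"
The greedy [^"]+ runs up to the last = inside the quotes, so the group captures everything after it.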
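The pattern above captures only the part after the last = sign. If you also want the whole URL in a group, as the question asks, one option is to nest the groups. Here is a minimal sketch in Python, assuming the href value is always double-quoted and contains at least one = sign; the function name extract_url_and_tail is made up for this example.

```python
import re

# Group 1: the whole URL. Group 2: everything after the last "=" in it.
# [^"]* is greedy, so it backtracks only as far as the LAST "=" before
# the closing quote; [^"=]* then cannot run past another "=".
HREF_PATTERN = re.compile(r'href="(https?[^"]*=([^"=]*))"')


def extract_url_and_tail(html):
    """Return (url, tail) for the first matching href, or None if none matches.

    Hypothetical helper for illustration only.
    """
    match = HREF_PATTERN.search(html)
    if match is None:
        return None
    return match.group(1), match.group(2)


if __name__ == "__main__":
    sample = (
        '<a href="http://my.domain.com/?s_cid=EM&s_ev9=CMC21892'
        '&s_ev10=EM_CMC21892_LC_stuff" style="color: #365EBF:">stuff</a>'
    )
    print(extract_url_and_tail(sample))
```

On the sample string from the question, group 1 should be the full URL and group 2 should be EM_CMC21892_LC_stuff. A URL with no = sign in it will not match at all, so check for None before using the result.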