I am trying to extract the contact link from the HTML below. I tried this expression, but it doesn't seem to work:
\"([^\"]*)\"(.*?)?\>(Kontakt)
Here is part of the HTML:
<li id="cc-nav-view-2315645627" class="jmd-nav__list-item-0">
<a href="/" data-link-title="Start" class="cc-nav-current j-nav-current jmd-nav__link--current">Start</a>
</li>
<li id="cc-nav-view-2315645625" class="jmd-nav__list-item-0">
<a href="/öffnungszeiten-schließzeiten/" data-link-title="Öffnungszeiten & Schließzeiten">Öffnungszeiten & Schließzeiten</a>
</li>
<li id="cc-nav-view-2316315025" class="jmd-nav__list-item-0">
<a href="/flyer/" data-link-title="Flyer">Flyer</a>
</li>
<li id="cc-nav-view-2315732425" class="jmd-nav__list-item-0">
<a href="/anfahrt/" data-link-title="Anfahrt">Anfahrt</a></li>
<li id="cc-nav-view-2315645825" class="jmd-nav__list-item-0">
<a href="/kontakt-termin-verbeinaren/" data-link-title="Kontakt / Termin verbeinaren">Kontakt / Termin verbeinaren</a>
</li>
I need to get the last a href occurrence, the one with the contact link, but the regexp returns the full string.
This expression might help you design the one you need:
(.*)(<a href=")([A-Za-z0-9\/-]+)(".*)
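It sweeps from the beginning with the greedy (.*) up to the last href, so the third group holds that link's URL. If the input spans several lines, the dot must also match newlines (the s / dotall flag), otherwise (.*) stops at the first line break. You can then add any boundary you need around the target URL.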
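As a rough sketch of how that might look in practice, here it is in Python (the question does not say which language or regex engine is in use, so that part is an assumption). The `html` string is a shortened copy of the markup from the question, the character class is written out as `A-Za-z`, and `re.DOTALL` is what lets the leading `(.*)` cross line breaks:

```python
import re

# Shortened copy of the navigation markup from the question.
html = '''<li id="cc-nav-view-2315732425" class="jmd-nav__list-item-0">
<a href="/anfahrt/" data-link-title="Anfahrt">Anfahrt</a></li>
<li id="cc-nav-view-2315645825" class="jmd-nav__list-item-0">
<a href="/kontakt-termin-verbeinaren/" data-link-title="Kontakt / Termin verbeinaren">Kontakt / Termin verbeinaren</a>
</li>'''

# The greedy (.*) first consumes the whole input, then backtracks only as far
# as needed to find an '<a href="', which is therefore the last one.
pattern = re.compile(r'(.*)(<a href=")([A-Za-z0-9\/-]+)(".*)', re.DOTALL)

match = pattern.match(html)
last_url = match.group(3) if match else None  # group 3 is the URL itself
print(last_url)
```

Note that the character class only allows ASCII letters, digits, `-` and `/`, so a link such as `/öffnungszeiten-schließzeiten/` would not match if it happened to come last.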
I'm not sure whether you want just the URL or the entire tag. If you want the entire tag, the expression can be modified to something like:
(.*)((<a href=")(.*)(\<\/a\>))
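For the whole-tag variant, a minimal sketch in Python could look like this (again assuming Python and a multi-line input, hence `re.DOTALL`). The helper name `last_anchor_tag` is made up for illustration, and the escaped `\<`, `\/`, `\>` are written as plain characters since Python does not need them escaped:

```python
import re

# Same idea as the expression above: group 2 wraps the complete last <a ...>...</a> tag.
TAG_PATTERN = re.compile(r'(.*)((<a href=")(.*)(</a>))', re.DOTALL)

def last_anchor_tag(html):
    """Return the last <a href="...">...</a> tag in html, or None if there is none."""
    match = TAG_PATTERN.match(html)
    return match.group(2) if match else None

# Shortened copy of the markup from the question.
sample = '''<li id="cc-nav-view-2316315025" class="jmd-nav__list-item-0">
<a href="/flyer/" data-link-title="Flyer">Flyer</a>
</li>
<li id="cc-nav-view-2315645825" class="jmd-nav__list-item-0">
<a href="/kontakt-termin-verbeinaren/" data-link-title="Kontakt / Termin verbeinaren">Kontakt / Termin verbeinaren</a>
</li>'''

tag = last_anchor_tag(sample)
print(tag)
```

This only matches anchors whose first attribute is `href`, exactly as in the expression; a tag written as `<a class="..." href="...">` would be skipped.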