php, regex, preg-match

Regex: find last occurrence in a string preceded by another string


I have a pretty huge string (a big chunk of html), in which I'd like to find a chunk according to this scenario:

<h2>Some text here</h2>
<p>Lorem ipsum... Lorem ipsum... String1... Lorem ipsum...</p>
<p>More Lorem ipsum... More Lorem ipsum...</p>

<h2>Some more text here</h2>
<p>Lorem ipsum... Lorem ipsum... String2... Lorem ipsum...</p>
<p>More Lorem ipsum... More Lorem ipsum...</p>

<h2>Another chunk here, same string</h2>
<p>Lorem ipsum... Lorem ipsum... String2... Lorem ipsum...</p>
<p>More Lorem ipsum... More Lorem ipsum...</p>

<h2>And even more text here</h2>
<p>Lorem ipsum... Lorem ipsum... String3... Lorem ipsum...</p>
<p>More Lorem ipsum... More Lorem ipsum...</p>

I'd like to find the last chunk that contains "String2", starting at its h2 and ending just before the next h2. In the example above, that would be:

<h2>Another chunk here, same string</h2>
<p>Lorem ipsum... Lorem ipsum... String2... Lorem ipsum...</p>
<p>More Lorem ipsum... More Lorem ipsum...</p>

Can anybody help me with this? I use PHP's preg-flavour of RegEx. I get stuck after the

<h2(.*+)String2<h2/im

and cannot get my head around how to find the last one only.

Thanks!


Solution

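  • You can use an approach like this. It matches each h2 section that contains String2 without crossing into another h2 first, so to get the last one, collect all matches (for example with preg_match_all) and keep the final one: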

    <h2>(?:[^\0](?!<h2>))*?String2[^\0]*?(?=<h2>)
    

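The pattern in this answer matches every h2 section containing the search string, not only the last one. Here is a minimal sketch of picking the last one in PHP. The helper name findLastSection, its signature, the "~" delimiter and the shortened sample input are illustrative assumptions, not part of the original answer.

```php
<?php
// Hypothetical helper: name and signature are illustrative only.
// Returns the last <h2> section containing $needle, or null if none matches.
function findLastSection(string $html, string $needle): ?string
{
    // The answer's pattern, with the needle escaped and "~" as the delimiter.
    // (?:[^\0](?!<h2>))*? walks forward lazily but refuses to cross a new <h2>.
    $pattern = '~<h2>(?:[^\0](?!<h2>))*?'
        . preg_quote($needle, '~')
        . '[^\0]*?(?=<h2>)~';

    // preg_match_all returns 0 for no matches and false on error.
    if (!preg_match_all($pattern, $html, $matches)) {
        return null;
    }

    // $matches[0] holds the full matches in document order; keep the last one.
    return rtrim(end($matches[0]));
}

// Shortened sample input mirroring the structure in the question.
$html = <<<'HTML'
<h2>Some text here</h2>
<p>Lorem ipsum... String1... Lorem ipsum...</p>

<h2>Some more text here</h2>
<p>Lorem ipsum... String2... Lorem ipsum...</p>

<h2>Another chunk here, same string</h2>
<p>Lorem ipsum... String2... Lorem ipsum...</p>

<h2>And even more text here</h2>
<p>Lorem ipsum... String3... Lorem ipsum...</p>
HTML;

// $last starts with "<h2>Another chunk here, same string</h2>".
$last = findLastSection($html, 'String2');
```

One caveat with the pattern as written: the trailing lookahead (?=<h2>) requires another h2 after the section, so a matching section at the very end of the document is skipped. Changing the lookahead to (?=<h2>|\z) should cover that case.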