
Find first occurrence of a character before a keyword with Regex


I'm trying to match the first occurrence of a character that comes before a keyword. In the string below, the goal is to match the ">" only on lines that contain the word "Change", so the regex should match three ">" characters, one on each of the last three lines.

<th class="col_heading level0 col5" >Thing</th>
<th class="col_heading level0 col6" >Second Thing</th>
<th class="col_heading level0 col7" >Third Thing</th>

<th class="col_heading level0 col5" >Thing Change</th>
<th class="col_heading level0 col6" >Second Thing Change</th>
<th class="col_heading level0 col7" >Third Thing Change</th>

I have the beginning of an answer using a lookahead: I can currently extract the text from the ">" up to the "Change" keyword, but I'm stuck on matching just the ">" itself.

([^"]*(?=Change))

Solution

  • Use

    >(?=.*?Change)
    

    See regex proof.

    EXPLANATION

    --------------------------------------------------------------------------------
      >                        '>'
    --------------------------------------------------------------------------------
      (?=                      look ahead to see if there is:
    --------------------------------------------------------------------------------
        .*?                      any character except \n (0 or more times
                                 (matching the least amount possible))
    --------------------------------------------------------------------------------
        Change                   'Change'
    --------------------------------------------------------------------------------
      )                        end of look-ahead
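
To check the pattern against the sample from the question, here is a minimal sketch in Python (the question does not name a language, so Python's `re` module is an assumption, and the variable names `html`, `positions`, and `lines_hit` are illustrative only). It relies on `.` not matching a newline by default, which keeps the lookahead confined to the current line.

```python
import re

# Sample input copied from the question.
html = """\
<th class="col_heading level0 col5" >Thing</th>
<th class="col_heading level0 col6" >Second Thing</th>
<th class="col_heading level0 col7" >Third Thing</th>

<th class="col_heading level0 col5" >Thing Change</th>
<th class="col_heading level0 col6" >Second Thing Change</th>
<th class="col_heading level0 col7" >Third Thing Change</th>
"""

# Match a '>' only if 'Change' appears later on the same line.
pattern = re.compile(r">(?=.*?Change)")

# Character offsets of each matched '>' in the whole string.
positions = [m.start() for m in pattern.finditer(html)]

# 1-based line number of each match (count newlines before the offset).
lines_hit = [html.count("\n", 0, p) + 1 for p in positions]

print(positions)
print(lines_hit)  # [5, 6, 7]
```

The closing ">" of each `</th>` is not matched because nothing after it on that line contains "Change". Note that the pattern matches every ">" that has "Change" somewhere to its right on the same line, so a line with several ">" before the keyword would yield several matches; the sample lines happen to have only one.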