
Non-greedy match with a lookbehind


I have this kind of text:

other text opt1 opt2 opt3 I_want_only_this_text because_of_this

And I am using this regex:

(?<=opt1|opt2|opt3).*?(?=because_of_this)

Which returns:

opt2 opt3 I_want_only_this_text

However, I want to match only "I_want_only_this_text".

What is the best way to achieve this?

I don't know in what order the opts will appear, and they are only examples. The actual words will be different, and there will be more of them.


Actual data: regex

(?<=※|を|備考|町|品は|。).*(?=のお届けとなります|でお届けします|にてお届け致します|にてお届けいたします)

Text:

こだわり豚には通常の豚よりビタミンB1が2倍以上あります。私たちの育てた愛情たっぷりのこだわり豚をぜひ召し上がってください。商品説明名称えびの産こだわり豚切落し産地宮崎県えびの市内容量500g×8パック合計4kg賞味期限90日保存方法-15℃以下で保存すること提供者株式会社さつま屋産業備考・本お礼品は冷凍でのお届けとなります

What I want to get:

冷凍で


Solution

  • You can use

    (?<=※|を|備考|町|品は|。)(?:(?!※|を|備考|町|品は|。).)*?(?=のお届けとなります|でお届けします|にてお届け致します|にてお届けいたします)
    

    The scheme is the same as in (?<=opt1|opt2|opt3)(?:(?!opt1|opt2|opt3).)*?(?=because_of_this) for the simplified example.

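    The tempered greedy token, (?:(?!...).)*?, consumes a character only if none of the lookbehind alternatives starts at that position, so a match can never run across a later delimiter. It also lets you match multiple occurrences of the same pattern in a longer string.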
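To illustrate the difference on the simplified example, here is a minimal Python sketch. It assumes Python's built-in re module, which accepts this lookbehind only because opt1, opt2 and opt3 all have the same length. The `longer` string is a made-up input, not from the question, and the surrounding spaces are removed with strip() because the pattern itself keeps them.

```python
import re

text = "other text opt1 opt2 opt3 I_want_only_this_text because_of_this"

# Original pattern: the lazy dot starts right after the FIRST delimiter found.
lazy = re.compile(r"(?<=opt1|opt2|opt3).*?(?=because_of_this)")

# Tempered pattern: a character is consumed only if no delimiter starts there.
tempered = re.compile(
    r"(?<=opt1|opt2|opt3)(?:(?!opt1|opt2|opt3).)*?(?=because_of_this)"
)

lazy_match = lazy.search(text).group().strip()
tempered_match = tempered.search(text).group().strip()

# Hypothetical longer input containing two records.
longer = (
    "opt1 opt2 first_value because_of_this "
    "and opt3 opt1 second_value because_of_this"
)
all_matches = [m.strip() for m in tempered.findall(longer)]

print(lazy_match)      # opt2 opt3 I_want_only_this_text
print(tempered_match)  # I_want_only_this_text
print(all_matches)     # ['first_value', 'second_value']
```

The lazy quantifier only controls where the match ends. It does not move the start, which is why the original pattern begins right after the first opt it finds.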

    Details

    • (?<=※|を|備考|町|品は|。) - a positive lookbehind that matches a location that is immediately preceded by one of the alternatives listed in the lookbehind
    • (?:(?!※|を|備考|町|品は|。).)*? - a tempered greedy token: any char other than a line break char, zero or more occurrences but as few as possible, where no matched char is the starting point of any of the alternatives in the negative lookahead
    • (?=のお届けとなります|でお届けします|にてお届け致します|にてお届けいたします) - a positive lookahead that requires one of the alternative patterns to appear immediately to the right of the current location.
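The same approach can be tried on the actual data. The sketch below is an adaptation rather than the exact pattern above: Python's built-in re module rejects lookbehinds whose alternatives differ in length (here one and two characters), so it assumes one fixed-width lookbehind per delimiter, joined with alternation. The names DELIMS, ENDINGS and pattern are illustrative only.

```python
import re

# Delimiters and endings copied from the pattern in the answer.
DELIMS = ["※", "を", "備考", "町", "品は", "。"]
ENDINGS = [
    "のお届けとなります",
    "でお届けします",
    "にてお届け致します",
    "にてお届けいたします",
]

# Python's re needs fixed-width lookbehinds, so build one per delimiter:
# (?:(?<=※)|(?<=を)|(?<=備考)|...)
lookbehind = "(?:" + "|".join(f"(?<={re.escape(d)})" for d in DELIMS) + ")"
delim_alt = "|".join(re.escape(d) for d in DELIMS)
end_alt = "|".join(re.escape(e) for e in ENDINGS)

# Lookbehind + tempered greedy token + lookahead, as in the answer.
pattern = re.compile(f"{lookbehind}(?:(?!{delim_alt}).)*?(?={end_alt})")

text = (
    "こだわり豚には通常の豚よりビタミンB1が2倍以上あります。"
    "私たちの育てた愛情たっぷりのこだわり豚をぜひ召し上がってください。"
    "商品説明名称えびの産こだわり豚切落し産地宮崎県えびの市内容量500g×8パック"
    "合計4kg賞味期限90日保存方法-15℃以下で保存すること"
    "提供者株式会社さつま屋産業備考・本お礼品は冷凍でのお届けとなります"
)

match = pattern.search(text)
result = match.group() if match else None
print(result)  # 冷凍で
```

Building the pattern from lists keeps the delimiters in one place, which may help when there are more of them than in this example.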