This question is similar to another one I asked here, "Match strings between delimiting characters", but I could not adapt that solution to the new task. (The solution should work with EmEditor or Notepad++.)
I need to match text between specific tags, e.g. <b class="b2">I have a lot of text, more text, some more text, text</b>
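and then convert it as shown in the expected output below: wrap each comma-separated item in [[ ]] and lowercase its first letter, unless the item begins with the pronoun "I".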
I have tried to get close to this by running several regexes in multiple steps, e.g.:
Find: (<b class="b2">)(.)
Replace: [[\L\2
Find: </b>
Replace: ]]
Find: (\[\[)(\w+), (\w+)(\]\])
Replace: \1\2]], [[\3\4
Input text:
Any text <b class="b2">I make laugh</b>: Ar. and P. γέλωτα. Some more text <b class="b2">Delight</b>: P. and V. [[τέρπω]].
Any text <b class="b2">I amuse oneself, pass the time</b>: P. διάγειν.
Any text <b class="b2">It amuses oneself with, pass the time over, amuse</b>: Ar. and P.
Expected output:
Any text [[I make laugh]]: Ar. and P. γέλωτα. Some more text [[delight]]: P. and V. [[τέρπω]].
Any text [[I amuse oneself]], [[pass the time]]: P. διάγειν.
Any text [[it amuses oneself with]], [[pass the time over]], [[amuse]]: Ar. and P.
This is a one-step solution:
Find what: (?:<b class="b2">|\G(, (?=.*</b>)))(I )?([^,<]+)(?:</b>)?
Replace with: $1[[$2\l$3]]
Uncheck ". matches newline", so that the lookahead .*</b> only looks for a closing tag on the current line.
Explanation:
(?: # non-capturing group
<b class="b2"> # literally
| # OR
\G # restart from last match position
( # group 1, a comma and a space
, # a comma and a space
(?=.*</b>) # positive lookahead, make sure we have a closing tag after
) # end group 1
) # end group
(I )? # group 2, uppercase I and a space, optional
([^,<]+) # group 3, one or more characters that are not a comma or a less-than sign
(?:</b>)? # optional end tag
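To see why "I make laugh" keeps its capital letter while "It amuses oneself with" is lowercased, groups 2 and 3 can be tried in isolation. The following is only a sketch in Python, not the regex engine used by the editors, and `split_item` is a hypothetical helper name introduced for illustration:

```python
import re

# Groups 2 and 3 of the pattern, applied to a single comma-separated item.
ITEM = re.compile(r"(I )?([^,<]+)")

def split_item(item: str):
    """Return (group 2, group 3) for one item, or None if it does not match."""
    m = ITEM.fullmatch(item)
    return m.groups() if m else None

print(split_item("I make laugh"))            # ('I ', 'make laugh')
print(split_item("It amuses oneself with"))  # (None, 'It amuses oneself with')
print(split_item("Delight"))                 # (None, 'Delight')
```

Only an item that starts with "I" followed by a space fills group 2, so only that "I" is kept out of the part whose first letter gets lowercased.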
Replacement:
$1 # content of group 1 (i.e. comma and space)
[[ # double opening square bracket
$2 # content of group 2 (i.e. "I ")
\l$3 # lowercase the first letter of group 3 (i.e. all characters up to a comma or the end tag)
]] # double closing square bracket
Result for given example:
Any text [[I make laugh]]: Ar. and P. γέλωτα. Some more text [[delight]]: P. and V. [[τέρπω]].
Any text [[I amuse oneself]], [[pass the time]]: P. διάγειν.
Any text [[it amuses oneself with]], [[pass the time over]], [[amuse]]: Ar. and P.
[[be at ease]], v.: P. and V. ἡσυχάζειν, V. ἡσύχως ἔχειν.
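For checking the result outside the editor, the same transformation can be sketched in Python. Python's built-in `re` module supports neither `\G` nor the `\l` replacement escape, so this sketch swaps in a different technique: it matches each whole `<b class="b2">...</b>` element and builds the replacement in a callback. The names `link_item` and `convert` are hypothetical, and the sketch assumes items are separated by a comma and a single space, as in the sample input.

```python
import re

# One whole tagged element; group 1 is the text between the tags.
TAG = re.compile(r'<b class="b2">(.*?)</b>')

def link_item(item: str) -> str:
    """Wrap one item in [[ ]], lowercasing its first letter after an optional 'I '."""
    prefix = "I " if item.startswith("I ") else ""
    rest = item[len(prefix):]
    return "[[" + prefix + rest[:1].lower() + rest[1:] + "]]"

def convert(text: str) -> str:
    """Replace every tagged element with its comma-separated [[...]] links."""
    def repl(m: re.Match) -> str:
        return ", ".join(link_item(part) for part in m.group(1).split(", "))
    return TAG.sub(repl, text)

line = 'Any text <b class="b2">I amuse oneself, pass the time</b>: P. διάγειν.'
print(convert(line))
# Any text [[I amuse oneself]], [[pass the time]]: P. διάγειν.
```

Text outside the tags, including existing links such as [[τέρπω]], is left untouched because only the tagged elements are matched.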