I want to match every word contained between double brackets and append an extra character to it (using Notepad++ or EmEditor). For example, the following text:
[[test]], look at my [[test|testing]] how it tests; I have a [[test]],you know a [[test, and more test]] and [[there's another test you know]]
should become:
[[test한]], look at my [[test한|testing한]] how it tests; I have a [[test한]],you know a [[test한, and한 more한 test한]] and [[there한's한 another한 test한 you한 know한]]
So far I can only match the full content: \[\[.*?\]\]
A strict version of the regex is:
(?:\G(?!\A)|\[\[)(?:(?!\[\[|]]).)*?\K\w+(?=.*?]])
This regex finds one or more word characters only inside [[ and ]]. See the regex demo.
A less strict pattern is:
(?:\G(?!\A)|\[\[)(?:(?!\[\[|]]).)*?\K\w+
Note the missing lookahead at the end. This regex finds one or more word characters either inside [[ and ]], or between [[ and the end of the string. See this regex demo.
If your text contains only well-balanced brackets, you may go for a regex that matches any word that is followed by zero or more chars other than brackets and then ]]:
\w+(?=[^][]*]])
See this regex demo.
The replacement is $0한 in all three cases, where $0 represents the whole match value.
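To try the third pattern outside the editor, here is a minimal Python sketch. It assumes the brackets are well balanced, as that pattern requires. The names add_suffix and WORD_BEFORE_CLOSE are made up for this illustration, and the pattern is written with escaped brackets, which should be equivalent to \w+(?=[^][]*]]). Python's built-in re module has no \G or \K, so the first two patterns cannot be tested this way.

```python
import re

# Escaped spelling of \w+(?=[^][]*]]):
# a word, followed by any non-bracket chars, followed by ]]
WORD_BEFORE_CLOSE = re.compile(r"\w+(?=[^\[\]]*\]\])")


def add_suffix(text, suffix="한"):
    """Append `suffix` to every word matched by the lookahead pattern."""
    # Python's re does not understand $0 in replacements; a callback
    # returning "whole match + suffix" plays the role of $0한 here.
    return WORD_BEFORE_CLOSE.sub(lambda m: m.group(0) + suffix, text)


sample = "[[test]], look at my [[test|testing]] how it tests"
print(add_suffix(sample))
# [[test한]], look at my [[test한|testing한]] how it tests
```

Words outside the brackets are left alone because the lookahead hits a [ or the end of the string before it can reach ]].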
Pattern details
(?:\G(?!\A)|\[\[) - either the end of the preceding match ((?!\A) excludes the start of string position from \G) or [[
(?:(?!\[\[|]]).)*? - any one char (other than line break chars if the ". matches newline" option is OFF, else including newlines), zero or more but as few as possible occurrences, that is not the starting point of a [[ or ]] char sequence (thus, matching is done only between [[ and ]], but this alone does not require ]] to be right there)
\K - a match reset operator that discards the text matched so far from the overall match value
\w+ - one or more word chars
(?=.*?]]) - a positive lookahead that requires zero or more chars other than line break chars (if the ". matches newline" option is OFF, else including newlines), as few as possible, and then ]].
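For engines without \G and \K (Python's built-in re is one), a two-pass approach can approximate the intent of the strict pattern: first match each closed [[...]] span, then append the character to every word inside it. This is a different technique, not a translation of the regex above, and it may behave differently on malformed input. The names mark_words, SPAN and WORD are hypothetical, and the sketch assumes a span does not cross a line break (no DOTALL flag).

```python
import re

# A closed span: [[, then chars that do not start [[ or ]], then ]]
SPAN = re.compile(r"\[\[((?:(?!\[\[|\]\]).)*)\]\]")
WORD = re.compile(r"\w+")


def mark_words(text, suffix="한"):
    """Append `suffix` to each word inside a closed [[...]] span."""

    def fix_span(m):
        # Second pass: only the text between the brackets is touched.
        inner = WORD.sub(lambda w: w.group(0) + suffix, m.group(1))
        return "[[" + inner + "]]"

    return SPAN.sub(fix_span, text)


print(mark_words("a]] b [[c|d e]]"))
# a]] b [[c한|d한 e한]]
```

Here the stray ]] after "a" is ignored because no [[ opens it, whereas the lookahead-only pattern \w+(?=[^][]*]]) would also mark "a".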