regex, notepad++

Find contents of parentheses that start with UPPERCASE letter?


I want to grab the contents of parentheses, excluding the parentheses themselves, then append a colon and wrap the result in the font tags shown below.

Before: (Woman 1), (Ki-Woo), (Drunk)

After: <font color="#FF4500"><b>Woman 1:</b></font>

Here's what I have so far:

Find: (\([A-Z]*(?:(\h*|-)[A-Z0-9][a-z]*)*\))

Replace: \<font color\=\"\#FFA500\"\>\<b\>($1)\:\<\/b\>\<\/font\>

Currently mine still includes the brackets in the Find.


Solution

  • Using the pattern that you tried, you can move the capture group to the inner part of the parentheses and make the second group a non-capturing one.

    Then use group 1 in the replacement.

    \(([A-Z]*(?:(?:\h*|-)[A-Z0-9][a-z]*)*)\)
    


    To avoid matching empty parentheses, you might write the pattern as:

    \(([A-Z][a-z]*(?:(?:\h*|-)[A-Z0-9][a-z]*)*)\)
    
    • \( Match (
    • ( Capture group 1
      • [A-Z][a-z]* Match a single char A-Z and optional chars a-z
      • (?: Non capture group to repeat as a whole part
        • (?:\h*|-) Match either 0+ horizontal whitespace chars or a single -
        • [A-Z0-9][a-z]* Match a single char A-Z or 0-9 and optional chars a-z
      • )* Close and repeat the non capture group 0+ times
    • ) Close group 1
    • \) Match )


    Note that this part of the pattern (?:\h*|-) matches either zero or more horizontal whitespace chars or a single hyphen. Because the whitespace is optional, the pattern matches Ki Woo and Ki-Woo, but it could also match KiWoo.
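The breakdown and the note above can be checked outside Notepad++ with a rough Python sketch. Two things here are assumptions of the sketch rather than part of the answer: Python's `re` module has no `\h`, so `[ \t]` stands in for horizontal whitespace, and `extract_name` is a hypothetical helper name.

```python
import re

# Same pattern as above, with [ \t] substituted for \h because
# Python's re module does not support \h (assumption of this sketch).
PATTERN = re.compile(r"\(([A-Z][a-z]*(?:(?:[ \t]*|-)[A-Z0-9][a-z]*)*)\)")


def extract_name(text):
    """Return the contents of group 1 if the whole string matches, else None."""
    match = PATTERN.fullmatch(text)
    return match.group(1) if match else None


# Made-up sample inputs: the three from the question plus a few edge cases.
for sample in ["(Woman 1)", "(Ki-Woo)", "(Drunk)", "(Ki Woo)", "(KiWoo)", "()", "(sighs)"]:
    print(sample, "->", extract_name(sample))
```

The last two samples return `None`: empty parentheses fail because the first `[A-Z]` is required, and `(sighs)` fails because it starts with a lowercase letter. `(KiWoo)` still matches, since the separator may be zero characters wide.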

    In the replacement, use group 1 with $1, followed by the colon:

    <font color="#FF4500"><b>$1:</b></font>
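As a quick sanity check of the whole find-and-replace, here is a minimal Python sketch of the same substitution, with the colon from the desired output placed inside the bold tag. It assumes `[ \t]` in place of `\h` (which Python's `re` lacks) and uses `\1` where Notepad++ uses `$1`; `tag_speakers` and the sample line are made up for illustration.

```python
import re

# Find pattern from the answer, with [ \t] standing in for \h (assumption).
PATTERN = re.compile(r"\(([A-Z][a-z]*(?:(?:[ \t]*|-)[A-Z0-9][a-z]*)*)\)")

# Python writes the group reference as \1; Notepad++ writes it as $1.
REPLACEMENT = r'<font color="#FF4500"><b>\1:</b></font>'


def tag_speakers(line):
    """Replace each matching parenthetical with the colored, bold 'Name:' markup."""
    return PATTERN.sub(REPLACEMENT, line)


# Hypothetical subtitle line, not taken from the question.
print(tag_speakers("(Ki-Woo) Is anyone home?"))
# prints: <font color="#FF4500"><b>Ki-Woo:</b></font> Is anyone home?
```

Parentheticals that do not start with an uppercase letter, such as `(sighs)`, are left unchanged because the pattern never matches them.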
    
