Tags: regex, intellij-idea, text, notepad++

How to convert a camelCased variable to lowercase with underscores in Notepad++ or IntelliJ using regular expressions


I have to rename the toString output variables in several hundred files with many occurrences in each. In the most efficient way possible, how could I parse this text:

   .append(", myVariable=").append(myVariable)
   .append(", myOtherVariable=").append(myOtherVariable)
   .append(", mylowervariable=").append(myLowerVariable) // note the left is already lowercase
   .append(", myVarWithURL=").append(myVarWithURL);

and it becomes:

   .append(", my_variable=").append(myVariable)
   .append(", my_other_variable=").append(myOtherVariable)
   .append(", mylowervariable=").append(myLowerVariable) // note the left is already lowercase
   .append(", my_var_with_url=").append(myVarWithURL);

The names on the right must remain unchanged; the names to the left of the equals sign should be converted only if they contain uppercase characters.

These will be of arbitrary lengths with a varying number of upper case letters. I was thinking I had to do some sort of lookahead but could not get the replacement value to work correctly.

I can do this in either IntelliJ or Notepad++, so I can use the \l and \L operators in the replacement to lowercase the matched text.

This was my thought process:

in: myLongCamelCasedVariable

re: ([a-z]+)([A-Z]{1})([a-z]+) // repeat grouping for capturing

       group 1       group 2        group 3         group 4
my + [ L + ong ] + [ C + amel ] + [ C + ased ] + [ V + ariable ]

Is it possible to use a regular expression to effectively capture the various groups of 'text' in the larger text string, and 'loop' over that and apply the output?

Out: $1_\l$2 .... etc

Now I am just stuck.


Solution

  • In Notepad++ (Regular expression search mode), you may use

    Find What: (?:\G(?!\A)|",\h*)\K(\b|[a-z]+)([A-Z]+)(?=\w*=")
    Replace With: $1_\L$2
    Match case: True

    Details:

    • (?:\G(?!\A)|",\h*) - start matching either at the end of the previous successful match (\G(?!\A)) or (|) at a ", sequence followed by zero or more horizontal whitespace characters (",\h*)
    • \K - discard the text matched so far from the overall match, so it is kept as-is and not replaced
    • (\b|[a-z]+) - Group 1: a word boundary or one or more lowercase letters
    • ([A-Z]+) - Group 2: one or more uppercase letters
    • (?=\w*=") - a positive lookahead: immediately to the right, there must be zero or more word chars followed by =" (this is what restricts the match to the name left of the equals sign).

    The replacement is $1_\L$2: Group 1, _, and then lowercased Group 2 value.
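
If the change has to be scripted across several hundred files rather than run from an editor, the same idea can be sketched in Python. This is not the regex above: Python's built-in re module does not support \G, \K or \h, so the sketch matches the whole quoted label and rewrites it in a replacement callback instead. The names LABEL, BOUNDARY, to_snake and convert_labels are hypothetical, chosen only for illustration, and the sketch assumes every label has the `", name="` shape shown in the question.

```python
import re

# Hypothetical helper patterns (not part of the answer above).
# LABEL matches a quoted label such as `", myVariable="` and captures
# the prefix, the name, and the `="` suffix separately.
LABEL = re.compile(r'(",[ \t]*)(\w+)(=")')

# BOUNDARY matches a run of uppercase letters that directly follows a
# lowercase letter: the "V" in "myVariable", the "URL" in "myVarWithURL".
BOUNDARY = re.compile(r'(?<=[a-z])([A-Z]+)')


def to_snake(name: str) -> str:
    """Insert "_" before each uppercase run, then lowercase everything."""
    return BOUNDARY.sub(lambda m: "_" + m.group(1), name).lower()


def convert_labels(source: str) -> str:
    """Rewrite only the names inside `", name="` labels; leave the rest alone."""
    return LABEL.sub(
        lambda m: m.group(1) + to_snake(m.group(2)) + m.group(3),
        source,
    )


line = '.append(", myVarWithURL=").append(myVarWithURL);'
print(convert_labels(line))
# prints: .append(", my_var_with_url=").append(myVarWithURL);
```

Because the callback only sees text matched by LABEL, the argument of the second .append(...) call is never touched, and a label that is already lowercase passes through unchanged.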
