regex notepad++ camelcasing snakecasing

All text from camelCase to SNAKE_CASE


I am trying to do some text manipulation using Notepad++ macros. My last step is converting camelCase strings to SNAKE_CASE. So far, no luck. I'm not very familiar with regex, so I can't write my own solution.

Example text file input:

firstLine(874),
secondLine(15),
thirdLineOfText87(0x0001);

Desired output:

FIRST_LINE(874),
SECOND_LINE(15),
THIRD_LINE_OF_TEXT_87(0x0001);

Regex or any plugin is an acceptable answer.


Solution

  • I suggest the following regex approach:

    Find What:      (\b[a-z]+|\G(?!^))((?:[A-Z]|\d+)[a-z]*)
    Replace With: \U\1_\2
    Match Case: ON.

    This will turn words like camelCase87LikeThis into CAMEL_CASE_87_LIKE_THIS. If you also need to support camel-case words that start with an uppercase letter, use the following modified regex:

    (\G(?!^)|\b[a-zA-Z][a-z]*)([A-Z][a-z]*|\d+)
    

    See the regex demo (also tested in Notepad++). Note that the \G alternative now comes first in Group 1 and that A-Z has been added to the character class for the first letter.
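For readers who want the same conversion outside Notepad++, here is a minimal Python sketch. Python's built-in re module has no \G anchor and no \U case operator in replacement strings, so this is not the regex above but a swapped-in technique: it matches each whole word (first segment plus one or more uppercase-led or digit segments) and rebuilds it in a replacement function. The name to_upper_snake is hypothetical, and re.ASCII is an assumption made to keep \d and \b limited to ASCII.

```python
import re

# Whole identifier: a first segment (one letter, then lowercase letters)
# followed by one or more "humps" (an uppercase-led segment or a digit run).
WORD = re.compile(r"\b([a-zA-Z][a-z]*)((?:[A-Z][a-z]*|\d+)+)", re.ASCII)
HUMP = re.compile(r"[A-Z][a-z]*|\d+", re.ASCII)


def to_upper_snake(text):
    """Hypothetical helper: convert camelCase/PascalCase words to UPPER_SNAKE_CASE."""

    def rebuild(match):
        head, humps = match.group(1), match.group(2)
        # Put "_" in front of every hump, then uppercase the whole word.
        return (head + HUMP.sub(lambda h: "_" + h.group(0), humps)).upper()

    return WORD.sub(rebuild, text)


sample = "firstLine(874),\nsecondLine(15),\nthirdLineOfText87(0x0001);"
print(to_upper_snake(sample))
print(to_upper_snake("CamelCase87LikeThis"))
```

Words without any uppercase or digit segment, and tokens that start with a digit such as 874 or 0x0001, are left untouched, because the word pattern requires a letter at a word boundary followed by at least one further segment.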

    Details:

    • (\b[a-z]+|\G(?!^)) - Group 1 capturing either of the two alternatives:
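      • \b[a-z]+ - start of a word (\b is the initial word boundary here) followed by 1+ lowercase ASCII letters
      • | - or
      • \G(?!^) - the end position of the previous successful match (\G), but not the start of a line ((?!^)), so this alternative only continues a word whose first part has already been matched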
    • ((?:[A-Z]|\d+)[a-z]*) - Group 2 capturing:
      • (?:[A-Z]|\d+) - either an uppercase ASCII letter ([A-Z]) or (|) 1+ digits (\d+)
      • [a-z]* - 0+ lowercase ASCII letters.

    The \U\1_\2 replacement pattern uppercases everything that follows \U and inserts a _ between the two groups (referenced with \1 and \2).
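To make the chain of matches concrete, the following Python sketch traces what the first regex captures on a single word. It is only an illustration under stated assumptions: Python's re module has no \G, so anchoring each continuation match at the end of the previous one with Pattern.match(word, pos) stands in for it, and the helper names trace_matches and apply_replacement are hypothetical.

```python
import re

# Groups 1 and 2 of the first alternative: word start, lowercase run, first hump.
FIRST = re.compile(r"\b([a-z]+)((?:[A-Z]|\d+)[a-z]*)", re.ASCII)
# Group 2 alone, used for the continuation matches where Group 1 is empty.
HUMP = re.compile(r"(?:[A-Z]|\d+)[a-z]*", re.ASCII)


def trace_matches(word):
    """Hypothetical helper: list the (Group 1, Group 2) pair of each successive match."""
    first = FIRST.match(word)
    if first is None:
        return []
    steps = [(first.group(1), first.group(2))]
    pos = first.end()
    while True:
        # Matching anchored at `pos` stands in for \G (end of the previous match).
        nxt = HUMP.match(word, pos)
        if nxt is None:
            break
        steps.append(("", nxt.group(0)))
        pos = nxt.end()
    return steps


def apply_replacement(steps):
    # Equivalent of \U\1_\2 for each match: uppercase "<group1>_<group2>".
    return "".join((g1 + "_" + g2).upper() for g1, g2 in steps)


steps = trace_matches("thirdLineOfText87")
print(steps)
print(apply_replacement(steps))
```

For thirdLineOfText87 the first match captures third and Line, and every later match has an empty Group 1, so each replacement contributes only _ plus the uppercased segment.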
