Tags: java, regex, string, uppercase, lowercase

Use Java and RegEx to convert casing in a string


Problem: Turn

"My Testtext TARGETSTRING My Testtext" 

into

"My Testtext targetstring My Testtext"

Perl supports the \L operation, which lowercases text and can be used in the replacement string.

Java's Pattern class does not support this operation. From its documentation:

"Perl constructs not supported by this class: [...] The preprocessing operations \l \u, \L, and \U."
https://docs.oracle.com/javase/10/docs/api/java/util/regex/Pattern.html


Solution

  • Java's regex replacement syntax has no case-conversion operators, so you can't do this with a replacement string alone. You have to post-process each match yourself with String.toUpperCase() or toLowerCase().

    Here's an example that uses a regex to find every word of at least 3 characters in a sentence and convert it to upper case:

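        // requires: import java.util.regex.Matcher; import java.util.regex.Pattern;
        String text = "no way oh my god it cannot be";
        // \b\w{3,}\b matches whole words of three or more word characters
        Matcher m = Pattern.compile("\\b\\w{3,}\\b").matcher(text);

        StringBuilder sb = new StringBuilder();
        int last = 0; // end index of the previous match
        while (m.find()) {
            sb.append(text, last, m.start());    // unchanged text before this match
            sb.append(m.group().toUpperCase());  // the match itself, upper-cased
            last = m.end();
        }
        sb.append(text, last, text.length());    // unchanged tail after the last match

        System.out.println(sb.toString());
        // prints "no WAY oh my GOD it CANNOT be"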
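
    Applied to the string from the question, the same find-and-append loop can lowercase the all-caps word. This is only a sketch: the class name, the method name, and the pattern \b[A-Z]{2,}\b (any whole word of two or more ASCII capitals) are assumptions made for illustration, since the question does not say how the target is identified. Locale.ROOT is passed so that the result does not depend on the default locale (in a Turkish locale, "I" would otherwise lowercase to a dotless "ı").

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LowercaseAllCaps {
    // Assumption: the target is any whole word made of two or more ASCII capitals.
    private static final Pattern ALL_CAPS = Pattern.compile("\\b[A-Z]{2,}\\b");

    // Hypothetical helper: returns text with every all-caps word lowercased.
    static String lowercaseAllCapsWords(String text) {
        Matcher m = ALL_CAPS.matcher(text);
        StringBuilder sb = new StringBuilder();
        int last = 0; // end index of the previous match
        while (m.find()) {
            sb.append(text, last, m.start());              // text before the match
            sb.append(m.group().toLowerCase(Locale.ROOT)); // the match, lowercased
            last = m.end();
        }
        sb.append(text, last, text.length());              // text after the last match
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(lowercaseAllCapsWords("My Testtext TARGETSTRING My Testtext"));
        // prints "My Testtext targetstring My Testtext"
    }
}
```

    "My" and "Testtext" are left alone because neither consists of two or more capitals only.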
    

    Note on appendReplacement and appendTail

    The solution above calls substring and tracks the end of the previous match by hand. Matcher.appendReplacement and appendTail do that bookkeeping for you:

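        m.reset(); // rewind, in case m was already consumed by an earlier find() loop
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            // quoteReplacement keeps any '$' or '\' in the match from being read as a group reference
            m.appendReplacement(sb, Matcher.quoteReplacement(m.group().toUpperCase()));
        }
        m.appendTail(sb); // copy whatever follows the last match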
    

    Note how sb is now a StringBuffer instead of a StringBuilder. On Java 8 and earlier, appendReplacement and appendTail accept only a StringBuffer, so there you're stuck with the slower, synchronized class if you want to use these methods. Java 9 added StringBuilder overloads of both.

    On older versions, it's up to you whether giving up some efficiency for better readability is worth it.
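
    On Java 9 and later the loop can be dropped altogether, because Matcher.replaceAll also accepts a function that computes the replacement for each match. The sketch below makes the same assumptions as before: the class name, the method name, and the pattern \b[A-Z]{2,}\b are illustrative and not part of the original question. The string returned by the function is still interpreted as a replacement string, hence the call to Matcher.quoteReplacement.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LowercaseWithReplaceAll {
    // Assumption: the target is any whole word made of two or more ASCII capitals.
    private static final Pattern ALL_CAPS = Pattern.compile("\\b[A-Z]{2,}\\b");

    // Hypothetical helper; requires Java 9+ for replaceAll(Function<MatchResult, String>).
    static String lowercaseAllCapsWords(String text) {
        return ALL_CAPS.matcher(text).replaceAll(
                match -> Matcher.quoteReplacement(match.group().toLowerCase(Locale.ROOT)));
    }

    public static void main(String[] args) {
        System.out.println(lowercaseAllCapsWords("My Testtext TARGETSTRING My Testtext"));
        // prints "My Testtext targetstring My Testtext"
    }
}
```

    This is about as close as Java gets to Perl's \L: the case conversion still happens in ordinary Java code, but the matcher handles all of the copying.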
