
Selectively uppercasing a string


I have a string with some XML tags in it, like:

"hello <b>world</b> and <i>everyone</i>"

Is there a good Scala/functional way of uppercasing the words, but not the tags, so that it looks like:

"HELLO <b>WORLD</b> AND <i>EVERYONE</i>"

Solution

  • We can use dustmouse's regex to match only the words outside the XML tags and rewrite each match with Regex.replaceAllIn. The matched text is available as Regex.Match.matched and can be uppercased with toUpperCase.

    // Matches a word that is not directly preceded by "<" or "</" and not directly followed by ">".
    val xmlText = """(?<!<|<\/)\b\w+(?!>)""".r
    
    val string = "hello <b>world</b> and <i>everyone</i>"
    xmlText.replaceAllIn(string, _.matched.toUpperCase)
    // String = HELLO <b>WORLD</b> AND <i>EVERYONE</i>
    
    val string2 = "<h1>>hello</h1> <span>world</span> and <span><i>everyone</i>"
    xmlText.replaceAllIn(string2, _.matched.toUpperCase)
    // String = <h1>>HELLO</h1> <span>WORLD</span> AND <span><i>EVERYONE</i>
    

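    Using dustmouse's updated regex, which also copes with tags that carry attributes (note that it only uppercases the word immediately following a tag):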

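    // Group 0 is a tag plus the word right after it; group 1 is just that word.
    val tagThenWord = """(?:<[^<>]+>\s*)(\w+)""".r
    
    val string3 = """<h1>>hello</h1> <span id="test">world</span>"""
    // Keep the tag part of the match unchanged and uppercase only the captured word.
    tagThenWord.replaceAllIn(string3, m => 
      m.group(0).dropRight(m.group(1).length) + m.group(1).toUpperCase)
    // String = <h1>>hello</h1> <span id="test">WORLD</span>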
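As a further sketch (not one of dustmouse's regexes), the same replaceAllIn idea can be turned around: match either a whole tag or a run of text between tags, copy tags through untouched, and uppercase everything else. This should handle attributes and text that does not directly follow a tag. The names UppercaseOutsideTags, token and upperOutsideTags are made up for illustration, and the pattern assumes that tags never contain a literal "<" or ">" inside attribute values. Regex.quoteReplacement is used because replaceAllIn treats "$" and "\" in the replacement string specially.

```scala
import scala.util.matching.Regex

object UppercaseOutsideTags {
  // Group 1: a complete tag such as <b>, </b> or <span id="test">.
  // Group 2: a run of text up to (not including) the next "<".
  private val token: Regex = """(<[^<>]+>)|([^<]+)""".r

  def upperOutsideTags(s: String): String =
    token.replaceAllIn(s, m => {
      // group(1) is null when the text alternative matched instead of the tag one.
      val out = if (m.group(1) != null) m.group(1) else m.group(2).toUpperCase
      // Escape "$" and "\" so they are inserted literally.
      Regex.quoteReplacement(out)
    })
}
```

For example, UppercaseOutsideTags.upperOutsideTags("""<h1>>hello</h1> <span id="test">world</span>""") gives <h1>>HELLO</h1> <span id="test">WORLD</span>. For anything beyond simple, well-formed tags, a real XML parser is likely the safer choice.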