regex, scala

How to remove square brackets and backticks from a string in Scala using regex replaceAll


I was trying to remove [ and ] (and the enclosing backticks) from a formula string:

col_formula:regexp_replace( regexp_replace([`cellid`], "(.*)_N", "N"), "_(.*)", "")
 
var replaced_col_formula= col_formula.replaceAll("/[\\[\\]']+/g", "")
println(s"replaced_col_formula:$replaced_col_formula")
 
replaced_col_formula:regexp_replace( regexp_replace([`cellid`], "(.*)_N", "N"), "_(.*)", "")

I was expecting something like below

replaced_col_formula:regexp_replace( regexp_replace(cellid, "(.*)_N", "N"), "_(.*)", "")

Solution

  • First of all, pass a plain string pattern ("example_pattern") to .replaceAll, not a regex literal (/example_pattern/). Java and Scala have no regex literal notation, so the slashes and the trailing g are matched as literal characters, and the pattern never matches this input.

    Also, replaceAll replaces every occurrence of a match in the input string by default (as does Spark's regexp_replace), so there is no need to pass any kind of "global" flag into the regex.

    So, you may use

    .replaceAll("\\[`(.*?)`]", "$1")
    


    Details

    • \[` - a [` substring
    • (.*?) - Group 1 (later referred to with $1 replacement backreference): any zero or more chars (other than line break chars) as few as possible
    • `] - a `] substring.
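
    The points above can be put together as a minimal sketch, assuming the formula is held in a plain Scala String. The object name StripBrackets and the helper stripBracketedBackticks are hypothetical and introduced only for illustration. The sketch also checks that the slash-delimited pattern from the question leaves the string unchanged.

```scala
object StripBrackets {
  // Hypothetical helper (not from the original post):
  // unwraps every [`name`] occurrence into name.
  def stripBracketedBackticks(formula: String): String =
    formula.replaceAll("\\[`(.*?)`]", "$1")

  def main(args: Array[String]): Unit = {
    // Sample input taken from the question.
    val colFormula =
      """regexp_replace( regexp_replace([`cellid`], "(.*)_N", "N"), "_(.*)", "")"""

    // The question's pattern: '/' and 'g' are ordinary characters here, so it
    // looks for "/", then brackets or single quotes, then "/g".
    val unchanged = colFormula.replaceAll("/[\\[\\]']+/g", "")
    println(unchanged == colFormula)

    val replaced = stripBracketedBackticks(colFormula)
    println(s"replaced_col_formula:$replaced")
  }
}
```

    Running main should print true for the first check, followed by the line the question was expecting. Because (.*?) is lazy, each [`...`] pair is unwrapped separately even when several appear on one line.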