scala replaceall

Replacing special characters in Scala


I am working in Scala and want to remove special characters from a column of my DataFrame. replaceAll doesn't seem to work. Is there another way?

My code is this:

val specialchar = dataframe.select(column).replaceAll("[^A-za-z]+","")

Solution

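  • replaceAll is a method on String, so it cannot be called on the DataFrame that select returns. Use Spark's regexp_replace column function instead, with a regex that lists the allowed characters.

    Try the following: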

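     import org.apache.spark.sql.functions.regexp_replace
     import spark.implicits._ // assumes a SparkSession named spark, as in spark-shell

     val badDF = Seq(("7369", "SMI_)(TH" , "2010-12-17", "800.00"), ("7499", "AL@;__#$LEN","2011-02-20", "1600.00")).toDF("empno", "ename","hire_date", "sal")
     // Keep only letters, digits, and spaces in every column
     val cleanedDF = badDF.select(badDF.columns.map(c => regexp_replace(badDF(c), """[^A-Z a-z 0-9]""", "").alias(c)): _*)
     cleanedDF.show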
    

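    Here ename contains special characters. The regex above keeps only uppercase and lowercase letters, the digits 0-9, and spaces; every other character is removed. Note that the pattern is applied to every column, so it also strips the hyphens from hire_date and the decimal point from sal. To clean only ename, apply regexp_replace to that column alone.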
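    If only one column needs cleaning, a minimal sketch along these lines may help. It reuses the sample data from the answer; the object name CleanEname, the app name, and the local[*] master are assumptions made for illustration, not part of the original answer.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, regexp_replace}

// Hypothetical wrapper object; in spark-shell an existing session is reused by getOrCreate.
object CleanEname {
  val spark: SparkSession = SparkSession.builder()
    .appName("clean-ename")   // assumed name, for illustration only
    .master("local[*]")       // assumed local run
    .getOrCreate()
  import spark.implicits._

  val badDF = Seq(
    ("7369", "SMI_)(TH", "2010-12-17", "800.00"),
    ("7499", "AL@;__#$LEN", "2011-02-20", "1600.00")
  ).toDF("empno", "ename", "hire_date", "sal")

  // Replace only the ename column; hire_date and sal keep their '-' and '.'.
  val cleanedDF = badDF.withColumn(
    "ename",
    regexp_replace(col("ename"), "[^A-Za-z0-9 ]", "")
  )

  def main(args: Array[String]): Unit = {
    cleanedDF.show()
    spark.stop()
  }
}
```

    With this variant the two ename values become SMITH and ALLEN, while the other columns are left unchanged.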

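    A side note on the pattern in the question: the character class [^A-za-z] uses the range A-z, which in ASCII also covers the six characters between Z and a ([ \ ] ^ _ and the backtick). The sketch below checks this on a plain Scala String, where replaceAll does exist; the object name RegexRangeCheck is made up for the example.

```scala
// Hypothetical helper object comparing the two character classes on a plain String.
object RegexRangeCheck {
  val sample = "SMI_)(TH"

  // A-z spans ASCII 65..122, so '_' (95) counts as allowed and survives.
  val loose: String = sample.replaceAll("[^A-za-z]+", "")

  // Listing A-Z and a-z separately allows letters only.
  val strict: String = sample.replaceAll("[^A-Za-z]+", "")

  def main(args: Array[String]): Unit = {
    println(loose)  // SMI_TH
    println(strict) // SMITH
  }
}
```

    Spark's regexp_replace uses Java regular expressions as well, so the same range behaviour applies to column expressions.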