regex, scala, apache-spark-sql, databricks, regexp-replace

Error: overloaded method value regexp_replace with alternatives


I'm trying to replace the "/" character with a space (" ") in a column called UserAgent in a DataFrame df_test.

Data in the column looks like this:

Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko

I have tried using

val df_test = spark.sql(s"select UserAgent from df_header_pivot")
df_test.withColumn("UserAgent", regexp_replace("UserAgent", "[/]", ""))

but I'm getting this error message:

notebook:4: error: overloaded method value regexp_replace with alternatives:
  (e: org.apache.spark.sql.Column,pattern: org.apache.spark.sql.Column,replacement: org.apache.spark.sql.Column)org.apache.spark.sql.Column
  (e: org.apache.spark.sql.Column,pattern: String,replacement: String)org.apache.spark.sql.Column
 cannot be applied to (org.apache.spark.sql.ColumnName, org.apache.spark.sql.Column)
df_test.withColumn("UserAgent", regexp_replace($"UserAgent" , lit("/")))


Solution

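  • regexp_replace has no overload that accepts the column name as a plain String. Both alternatives listed in the error take a Column as the first argument, followed by a pattern and a replacement (either both String or both Column). Prefix the column name with $ (or use col("UserAgent")) to get a Column, and pass all three arguments. The call shown in the error message passes only two arguments, so it matches neither overload. Also note that the replacement must be " " rather than "" if you want a space instead of deleting the "/". The $ syntax comes from import spark.implicits._; col("UserAgent") works without it.

    import org.apache.spark.sql.functions._
    val df_test = spark.sql(s"select UserAgent from df_header_pivot")
    df_test.withColumn("UserAgent", regexp_replace($"UserAgent", "[/]", " "))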
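
For anyone who wants to try this outside the notebook, here is a minimal self-contained sketch that exercises both overloads named in the error message. It is only an illustration under some assumptions: the local SparkSession, the object name UserAgentCleanup, the helper replaceSlashes, and the one-row sample DataFrame are all hypothetical stand-ins for the asker's df_header_pivot query, and a Spark SQL dependency is assumed to be on the classpath.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, lit, regexp_replace}

object UserAgentCleanup {

  // Hypothetical helper: uses the (Column, String, String) overload.
  // col("UserAgent") builds a Column without needing spark.implicits._
  def replaceSlashes(df: DataFrame): DataFrame =
    df.withColumn("UserAgent", regexp_replace(col("UserAgent"), "/", " "))

  def main(args: Array[String]): Unit = {
    // Local session for illustration only; a Databricks notebook already provides `spark`.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("regexp-replace-sketch")
      .getOrCreate()
    import spark.implicits._ // enables toDF and the $"..." column syntax

    // Stand-in for: spark.sql("select UserAgent from df_header_pivot")
    val dfTest = Seq(
      "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko"
    ).toDF("UserAgent")

    // (Column, String, String) overload via the helper
    val cleaned = replaceSlashes(dfTest)

    // (Column, Column, Column) overload: pattern and replacement wrapped in lit(...)
    val cleanedAlt = dfTest.withColumn(
      "UserAgent",
      regexp_replace($"UserAgent", lit("/"), lit(" "))
    )

    // Both rows now read:
    // Mozilla 5.0 (Windows NT 6.1; WOW64; Trident 7.0; rv:11.0) like Gecko
    cleaned.show(false)
    cleanedAlt.show(false)

    spark.stop()
  }
}
```

What does not compile is mixing the two forms or passing a bare String first: regexp_replace("UserAgent", "/", " ") and regexp_replace($"UserAgent", lit("/")) both fail with the "overloaded method value regexp_replace with alternatives" error from the question.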