string scala apache-spark apache-spark-sql multiple-columns

Comparing string values of two columns in Scala


I have two string columns, x and y, in a DataFrame that I want to compare. If they are the same (case-insensitive), I want to return x; if they are different, I want to concatenate x and y. I know I can do this in SQL, but I'm trying to do it in Scala.

select (case when x = y then x else concat(x, '. ', y) end) as match from test

In Scala:

df.select(when(col("x") == col("y"), col("x"))
        .otherwise(concat(col("x"), lit(". "), col("y")))
        .as("match"))

I get the following error when I test it:

error: type mismatch;
 found   : Boolean
 required: org.apache.spark.sql.Column


Solution

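  • Use === instead of == for equality checks in Spark's Scala API. On two Column objects, == is ordinary Scala equality and returns a Boolean, which is why when(...) rejects it; === builds a Column expression that Spark evaluates for each row.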

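      import org.apache.spark.sql.functions.{col, concat, lit, when}
      // === builds a Column expression; lit takes a String, so use double quotes
      df.select(when(col("x") === col("y"), col("x"))
            .otherwise(concat(col("x"), lit(". "), col("y")))
            .as("match"))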
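
The question asks for a case-insensitive match, and === on its own compares strings case-sensitively. One possible way to handle that is to lower-case both sides before comparing. The sketch below is only an illustration: the helper name withMatch, the local SparkSession, and the sample rows are assumptions made up for the example, and it is written script-style (run as a Scala script or pasted into a REPL with Spark on the classpath).

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, concat, lit, lower, when}

// Hypothetical helper (not part of the original answer): returns x when
// x and y are equal ignoring case, otherwise "x. y".
def withMatch(df: DataFrame): DataFrame =
  df.select(
    when(lower(col("x")) === lower(col("y")), col("x")) // case-insensitive equality
      .otherwise(concat(col("x"), lit(". "), col("y")))
      .as("match"))

// Local session and sample data are assumptions for illustration only.
val spark = SparkSession.builder()
  .master("local[1]")
  .appName("match-demo")
  .getOrCreate()
import spark.implicits._

val df = Seq(("Foo", "foo"), ("Foo", "Bar")).toDF("x", "y")

// Collect the single string column back to the driver for inspection.
val result = withMatch(df).as[String].collect().toSeq
```

With these two sample rows, the first pair differs only by case and yields x unchanged, while the second pair is joined with ". " between the values.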