I am using spark scala with DataFrame API, trying to convert the below sql logic
CASE
WHEN col_1 like '%XYZ' OR col_1 like '%ZYX' THEN
CASE WHEN col_2 like '%TTT' THEN 'ABC' ELSE 'BBA' END
WHEN col_1 not like '%XYZ' OR col_1 not like '%ZYX'
CASE WHEN col_2 like '%YYY' THEN BBC' END
END as new_col
How to construct CASE WHEN with multiple like and not like conditions with spark scala dataframe api?
Use the expr function and pass the whole case statement in it as below.
import org.apache.spark.sql.functions._
val df=Seq(
("A","01/01/2022",1), ("AXYZ","02/01/2022",1), ("AZYX","03/01/2022",1),("AXYZ","04/01/2022",0), ("AZYX","05/01/2022",0),("AB","06/01/2022",1), ("A","07/01/2022",0) ).toDF("Category", "date", "Indictor")
df.select(col("*"),expr("""CASE WHEN Category like '%XYZ' OR Category like '%ZYX' THEN
CASE WHEN Indictor = 1 THEN 'ABC' ELSE 'BBA' END
WHEN Category not like '%XYZ' OR Category not like '%ZYX' then
CASE WHEN Indictor = 1 THEN 'BBC' ELSE 'BBD' END
END""").alias("new_col")).show()
+--------+----------+--------+-------+
|Category| date|Indictor|new_col|
+--------+----------+--------+-------+
| A|01/01/2022| 1| BBC|
| AXYZ|02/01/2022| 1| ABC|
| AZYX|03/01/2022| 1| ABC|
| AXYZ|04/01/2022| 0| BBA|
| AZYX|05/01/2022| 0| BBA|
| AB|06/01/2022| 1| BBC|
| A|07/01/2022| 0| BBD|
+--------+----------+--------+-------+