pyspark apache-spark-sql azure-databricks databricks-sql

How to convert T-SQL CASE WHEN STATEMENT to Databricks sparkSQL

I am trying to code a typical T-SQL CASE WHEN statement in sparkSQL.

The T-SQL CASE WHEN statement looks as follows:

SELECT
  *
 ,CASE
    WHEN DatesTestv3.as_of_date = EOMONTH(DATEADD(MONTH, -1, SYSDATETIME())) THEN 'LM'
  END AS 'Last Month End'
from dbo.DatesTestv3

My attempt at converting the above T-SQL to sparkSQL is as follows:

spark.sql("select *, case when as_of_date = last_day(add_months(current_date(),-1)) then 'LM' END AS 'Last Month End" from table)

However, I'm getting the ParseException error.

I'm sure it's probably a simple syntax issue.

Therefore, any thoughts on how to convert the T-SQL to sparkSQL?

Solution

As per my knowledge I have tried the Same CASE Statement in the Pyspark at my end.

I have created the table with below columns. id , as_of_date , last_month_end

from pyspark.sql.functions import to_date
spark.sql("CREATE TABLE data020 (id INT, as_of_date DATE, last_month_end STRING)
data = [
(1, "2023-06-01", "MA"),
(2, "2023-06-01", "MA"),
(3, "2023-06-01", "MA"),
(4, "2023-06-01", "MA"),
(5, "2023-06-01", "MA"),
(6, "2023-06-01", "MA"),
(7, "2023-06-01", "MA"),
(8, "2023-06-01", "MA"),
(9, "2023-06-01", "MA"),
(10, "2023-06-01", "MA")
]
df = spark.createDataFrame(data, ["id", "as_of_date", "last_month_end"])
df = df.withColumn("as_of_date", to_date(df.as_of_date))
df.write.mode("overwrite").insertInto("data020")

enter image description here The CASE statement I have tried is some thing like this. I have tried in 2 different ways.

from pyspark.sql.functions import last_day, date_sub, current_date, when
data = spark.table("data_dry")
data = data_dry.withColumn("Last Month End", when(data.as_of_date == last_day(date_sub(current_date(), 1)), "LM").otherwise(None))
data_dry.display()

And

result = spark.sql("""

SELECT *, CASE WHEN as_of_date = last_day(date_add(current_date(), -1)) THEN 'LM' ELSE NULL END AS Last Month End FROM data_dry """)

result.display()

Also you can try with the Date_trunc function too.