apache-spark-sql, datediff

DATEDIFF in Spark SQL


I am new to Spark SQL. We are migrating data from SQL Server to Databricks, and I am using Spark SQL. Can you please suggest how to achieve the functionality of the date functions below in Spark SQL? As far as I can see, datediff returns only days in Spark SQL.

DATEDIFF(YEAR,StartDate,EndDate)
DATEDIFF(Month,StartDate,EndDate) 
DATEDIFF(Quarter,StartDate,EndDate)

Solution

  • As you mentioned, Spark SQL does support DATEDIFF, but for days only. Also be careful with the argument order, which is reversed in Spark, i.e.:

    --SQL Server
    DATEDIFF ( datepart , startdate , enddate )
    
    --Spark
    DATEDIFF ( enddate , startdate )
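
    As a quick illustration of why the argument order matters, here is a minimal sketch using two literal dates (the dates are made up for the example). Passing the dates in SQL Server order silently flips the sign rather than raising an error:

    ```sql
    -- Spark SQL: end date first, start date second
    SELECT DATEDIFF( '2021-03-01', '2021-01-01' ) AS diff_days;          -- 59

    -- Arguments in SQL Server order (start, end) give a negative result
    SELECT DATEDIFF( '2021-01-01', '2021-03-01' ) AS diff_days_swapped;  -- -59
    ```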
    

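    Spark does, however, support a similar function called months_between, which you could use in place of DATEDIFF( month, ... ). It returns a fractional number of months, so optionally cast it to INT for functionality similar to the SQL Server function. Be aware that SQL Server counts the month boundaries crossed, whereas the cast truncates the elapsed months, so the two can differ for dates near a month end: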

    SELECT startDate, endDate, 
      DATEDIFF( endDate, startDate ) AS diff_days,
      CAST( months_between( endDate, startDate ) AS INT ) AS diff_months      
    FROM yourTable
    ORDER BY 1;
    

    There are also year and quarter functions that return the year and quarter of a date respectively. You could simply subtract the years, but quarters are trickier: you may have to 'do the math' or end up using a calendar table.
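
    One way to 'do the math' is sketched below. It assumes the same placeholder names as the query above (yourTable, startDate, endDate) and uses the built-in year, quarter and month functions. Each difference is a count of boundaries crossed, which is how SQL Server's DATEDIFF behaves, so it is not the same as counting full elapsed periods. Treat it as a starting point and check it against your own data:

    ```sql
    SELECT startDate, endDate,
      -- year boundaries crossed, like DATEDIFF( YEAR, ... )
      year( endDate ) - year( startDate ) AS diff_years,
      -- quarter boundaries crossed: 4 per year plus the quarter offset
      ( year( endDate ) - year( startDate ) ) * 4
        + ( quarter( endDate ) - quarter( startDate ) ) AS diff_quarters,
      -- month boundaries crossed, as an alternative to months_between
      ( year( endDate ) - year( startDate ) ) * 12
        + ( month( endDate ) - month( startDate ) ) AS diff_months_boundary
    FROM yourTable
    ORDER BY 1;
    ```

    For example, a start date of 2020-12-31 and an end date of 2021-01-01 gives 1 for all three columns, because one year, one quarter and one month boundary lie between them even though only a day has passed.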