Search code examples
apache-sparkdatepysparkapache-spark-sqlazure-databricks

Change the day of the date to a particular day


I basically have a requirement that needs a column that as the PeriodEndDate in. The period always ends on the 23rd of the month.

I need to take a date from a column in this case it is the last day of the month each day, and set the "day" of that date to be "23".

I have tried doing the following:

.withColumn("periodEndDate", change_day(jsonDF2.periodDate, sf.lit(23)))

cannot import name 'change_day' from 'pyspark.sql.functions'


Solution

  • You can use make_date

    from pyspark.sql import functions as F
    df = spark.createDataFrame([('2022-05-31',)], ['periodDate'])
    
    df = df.withColumn('periodEndDate', F.expr("make_date(year(periodDate), month(periodDate), 23)"))
    
    df.show()
    # +----------+-------------+
    # |periodDate|periodEndDate|
    # +----------+-------------+
    # |2022-05-31|   2022-05-23|
    # +----------+-------------+