apache-spark, datetime, pyspark, apache-spark-sql, apache-spark-3.0

How to get week of month in Spark 3.0+?


I cannot find any datetime formatting pattern to get the week of month in Spark 3.0+.

Since week-based patterns like 'W' are no longer supported in the new datetime formatter, is there a solution to get the week of month without using the legacy option?

The code below does not work on Spark 3.2.1:

import pyspark.sql.functions as f

df = df.withColumn("weekofmonth", f.date_format(f.col("Date"), "W"))
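For reference, on Spark 3.x this line fails at analysis time because all week-based patterns ('w', 'W', 'Y') were dropped from the new datetime formatter; the only way to keep 'W' working is the legacy parser, which is exactly what the question wants to avoid. A minimal sketch of that legacy workaround, assuming a SparkSession named spark:

    # Legacy escape hatch (not what the question asks for): re-enable the
    # pre-3.0 SimpleDateFormat-based parser for the whole session.
    spark.conf.set("spark.sql.legacy.timeParserPolicy", "LEGACY")

    df = df.withColumn("weekofmonth", f.date_format(f.col("Date"), "W"))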

Solution

  • You can try using a UDF:

    from calendar import monthcalendar

    from pyspark.sql.functions import col, dayofmonth, month, udf, year

    df = spark.createDataFrame(
        [(1, "2022-04-22"), (2, "2022-05-12")], ("id", "date"))

    def get_week_of_month(year, month, day):
        # monthcalendar() returns one list of day numbers per week
        # (Monday-first, zero-padded); return the 1-based index of the
        # week that contains `day`.
        return next(
            (
                week_number
                for week_number, days_of_week in enumerate(monthcalendar(year, month), start=1)
                if day in days_of_week
            ),
            None,
        )

    fn1 = udf(get_week_of_month)
    df = df.withColumn("week_of_mon", fn1(year(col("date")), month(col("date")), dayofmonth(col("date"))))
    df.show()
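    With the two sample rows this should print week_of_mon = 4 for 2022-04-22 and 3 for 2022-05-12. Note that calendar.monthcalendar counts weeks Monday-first, so "week 1" is the (possibly partial) week containing the 1st of the month.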
    

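  • Alternatively, the same Monday-first week number can be computed with built-in functions only, avoiding the Python UDF overhead. A minimal sketch, assuming the Spark SQL functions weekday (0 = Monday ... 6 = Sunday) and trunc(date, 'month') (first day of the month):

    from pyspark.sql.functions import expr

    # week_of_month = ceil((day of month + weekday of the 1st) / 7):
    # the weekday offset shifts days so that the partial week holding
    # the 1st counts as week 1, matching calendar.monthcalendar.
    df = df.withColumn(
        "week_of_mon_builtin",
        expr("ceil((dayofmonth(date) + weekday(trunc(date, 'month'))) / 7)"),
    )
    df.show()

    This should agree with the UDF on the sample rows (4 and 3), and since it is a native expression rather than a Python UDF, Spark can optimize it.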