scala apache-spark apache-spark-sql date-format

day of week date format string java inside spark


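import org.apache.spark.sql.functions._
import spark.implicits._ // assumes a SparkSession named `spark`, as in spark-shell
val df = Seq("2019-07-30", "2019-08-01").toDF
val dd = df.withColumn("value", to_date('value))
dd.show(false)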

According to the SimpleDateFormat docs (https://docs.oracle.com/javase/7/docs/api/java/text/SimpleDateFormat.html), F is the pattern letter for the day of week in month. Indeed,

dd.withColumn("dow", date_format('value, "EEEE")).withColumn("dow_number", date_format('value, "F")).show(false)

+----------+--------+----------+
|value     |dow     |dow_number|
+----------+--------+----------+
|2019-07-30|Tuesday |5         |
|2019-08-01|Thursday|1         |
+----------+--------+----------+

gives only the day of the week in the month (5 here means the fifth Tuesday of July), not the day of the week itself.
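
The values 5 and 1 follow from how SimpleDateFormat defines "F": it reports which occurrence of that weekday within the month the date is. This can be checked without Spark. The following is a minimal sketch in plain Scala, assuming an English locale; the object name DayOfWeekInMonthDemo and the helper fmt are illustrative only.

```scala
import java.text.SimpleDateFormat
import java.util.Locale

object DayOfWeekInMonthDemo {
  // Formats an ISO date string (yyyy-MM-dd) with the given SimpleDateFormat pattern.
  def fmt(date: String, pattern: String): String = {
    val parsed = new SimpleDateFormat("yyyy-MM-dd", Locale.ENGLISH).parse(date)
    new SimpleDateFormat(pattern, Locale.ENGLISH).format(parsed)
  }

  def main(args: Array[String]): Unit = {
    // 2019-07-30 is the fifth Tuesday of July 2019.
    println(fmt("2019-07-30", "EEEE") + " " + fmt("2019-07-30", "F")) // Tuesday 5
    // 2019-08-01 is the first Thursday of August 2019.
    println(fmt("2019-08-01", "EEEE") + " " + fmt("2019-08-01", "F")) // Thursday 1
  }
}
```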

Which format string gives me the day of the week as a number (integer)?

Obviously, I could use the approach from http://www.java2s.com/Tutorials/Java/Data_Type_How_to/Date/Get_day_of_week_int_value_and_String_value.htm, but I do not want to resort to a UDF; I want to use the Catalyst-optimized date_format. So which date format string gives me the desired result?


Solution

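  • As mentioned in the comments, you are looking for the "u" format, which in SimpleDateFormat is the day number of the week (1 = Monday, ..., 7 = Sunday).

    Also, from Spark 2.3.0 you might want to use the dayofweek function, which is faster (see the dayofweek documentation). Note that dayofweek numbers the days differently: 1 = Sunday, ..., 7 = Saturday.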
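
A minimal sketch that puts both options side by side on the dates from the question. It assumes a Spark 2.x release from 2.3.0 onward, where date_format follows SimpleDateFormat pattern letters; Spark 3.0 and later define their own datetime patterns, so "u" may behave differently there. The local SparkSession, the object name, and the column names dow_u and dow_fn are illustrative only.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{date_format, dayofweek, to_date}

object DayOfWeekNumberDemo {
  def main(args: Array[String]): Unit = {
    // Local session for illustration only; in spark-shell `spark` already exists.
    val spark = SparkSession.builder().master("local[1]").appName("dow-demo").getOrCreate()
    import spark.implicits._

    val dd = Seq("2019-07-30", "2019-08-01").toDF("value")
      .withColumn("value", to_date($"value"))

    dd
      // SimpleDateFormat "u" (Spark 2.x assumption): 1 = Monday ... 7 = Sunday, cast from string to int
      .withColumn("dow_u", date_format($"value", "u").cast("int"))
      // Built-in function, available from Spark 2.3.0: 1 = Sunday ... 7 = Saturday
      .withColumn("dow_fn", dayofweek($"value"))
      .show(false)

    spark.stop()
  }
}
```

Under those assumptions, the Tuesday and the Thursday should come out as 2 and 4 in dow_u and as 3 and 5 in dow_fn, so the two columns are not interchangeable without an offset.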