val df = Seq("2019-07-30", "2019-08-01").toDF
val dd = df.withColumn("value", to_date('value))
dd.show(false)
According to the docs (https://docs.oracle.com/javase/7/docs/api/java/text/SimpleDateFormat.html), F is the format string for the day of week in month. And
dd.withColumn("dow", date_format('value, "EEEE")).withColumn("dow_number", date_format('value, "F")).show(false)
+----------+--------+----------+
|value     |dow     |dow_number|
+----------+--------+----------+
|2019-07-30|Tuesday |5         |
|2019-08-01|Thursday|1         |
+----------+--------+----------+
gives only the day of the week in the month (5 for the fifth Tuesday of July, 1 for the first Thursday of August), not the day of the week itself.
Which format string gives me the day of the week as a number (integer)?
Obviously, I could use the approach from http://www.java2s.com/Tutorials/Java/Data_Type_How_to/Date/Get_day_of_week_int_value_and_String_value.htm
but I do not want to resort to a UDF; I want to use the Catalyst-optimized date_format. So which date format string gives me the desired result?
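For reference, the 5 and 1 in the output above can be reproduced without Spark. The following is a minimal sketch that calls java.text.SimpleDateFormat directly (the class the linked docs describe) and compares the result with a hand computation; the object and method names (DayOfWeekInMonth, patternF, byHand) are made up for illustration.

```scala
import java.text.SimpleDateFormat
import java.util.Locale

object DayOfWeekInMonth {
  // Format a yyyy-MM-dd string with pattern "F" (day of week in month).
  def patternF(date: String): Int = {
    val parsed = new SimpleDateFormat("yyyy-MM-dd", Locale.ENGLISH).parse(date)
    new SimpleDateFormat("F", Locale.ENGLISH).format(parsed).toInt
  }

  // The same value by hand: days 1-7 of a month map to 1, days 8-14 to 2, and so on.
  def byHand(dayOfMonth: Int): Int = (dayOfMonth - 1) / 7 + 1

  def main(args: Array[String]): Unit = {
    println(patternF("2019-07-30")) // 30 July 2019 is the fifth Tuesday of the month
    println(patternF("2019-08-01")) // 1 August 2019 is the first Thursday of the month
  }
}
```

So F answers "which Tuesday of the month is this?", not "which day of the week is this?".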
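As mentioned in the comments, you are looking for the "u" format, which gives the day number of the week (1 = Monday, ..., 7 = Sunday).
Also, from Spark 2.3.0 you might want to use the dayofweek function, which is faster (see the dayofweek documentation). Note that it numbers the days differently: 1 = Sunday, ..., 7 = Saturday.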
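The two numbering conventions are easy to mix up, so here is a small sketch that checks them with plain JDK classes rather than a Spark session. It assumes that date_format passes the pattern to SimpleDateFormat as in the docs linked in the question, and it uses Calendar.DAY_OF_WEEK only as a stand-in for the Sunday-first numbering of dayofweek. The names DayOfWeekNumbers, isoDayOfWeek and sundayFirstDayOfWeek are illustrative, not part of any API.

```scala
import java.text.SimpleDateFormat
import java.util.{Calendar, Locale}

object DayOfWeekNumbers {
  private def parse(date: String) =
    new SimpleDateFormat("yyyy-MM-dd", Locale.ENGLISH).parse(date)

  // Pattern "u": 1 = Monday ... 7 = Sunday.
  // Spark counterpart (assumed): date_format('value, "u").cast("int")
  def isoDayOfWeek(date: String): Int =
    new SimpleDateFormat("u", Locale.ENGLISH).format(parse(date)).toInt

  // Calendar.DAY_OF_WEEK: 1 = Sunday ... 7 = Saturday,
  // the same numbering that Spark's dayofweek('value) uses.
  def sundayFirstDayOfWeek(date: String): Int = {
    val cal = Calendar.getInstance()
    cal.setTime(parse(date))
    cal.get(Calendar.DAY_OF_WEEK)
  }

  def main(args: Array[String]): Unit =
    for (d <- Seq("2019-07-30", "2019-08-01"))
      println(s"$d u=${isoDayOfWeek(d)} sundayFirst=${sundayFirstDayOfWeek(d)}")
}
```

For the two dates in the question (a Tuesday and a Thursday), "u" yields 2 and 4, while the Sunday-first numbering yields 3 and 5, so pick one convention and stick with it downstream.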