I have a piece of Spark code that looks like this:
df // existing dataframe
  .withColumn("input_date", lit("20190105"))
  .withColumn("input_date_epoch", unix_timestamp(col("input_date"), "YYYYMMdd"))
Now, when I run df.describe(), the data returned shows the input_date_epoch column with every value as 1546128000, which an epoch converter reports as 2018-12-30 00:00:00 rather than the expected 2019-01-05 00:00:00.
Am I doing something wrong here?
The pattern is wrong: if you want the calendar year with four digits, use lowercase yyyy:
import org.apache.spark.sql.functions.{col, lit, unix_timestamp}

spark.range(5)
  .withColumn("input_date", lit("20190105"))
  .withColumn("input_date_epoch", unix_timestamp(col("input_date"), "yyyyMMdd"))
  .show(false)
YYYY actually refers to the week year rather than the calendar year; see the documentation.
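For reference, here is a minimal sketch of where the behaviour comes from. Spark 2.x hands the pattern to java.text.SimpleDateFormat, and when the week-year letter Y is used, the month and day-of-month fields are effectively ignored: the date resolves to the first day of the first week of week-year 2019, which under US week rules is Sunday 2018-12-30. The object name and standalone-app structure below are just for illustration, and I'm assuming the UTC time zone your numbers imply; only SimpleDateFormat itself is doing the work.

import java.text.SimpleDateFormat
import java.util.{Locale, TimeZone}

object WeekYearVsCalendarYear {
  def main(args: Array[String]): Unit = {
    val utc = TimeZone.getTimeZone("UTC")

    // calendar year: "20190105" parses as 2019-01-05 00:00:00 UTC
    val calendarYear = new SimpleDateFormat("yyyyMMdd", Locale.US)
    calendarYear.setTimeZone(utc)
    println(calendarYear.parse("20190105").getTime / 1000) // 1546646400

    // week year: the parse resolves to the start of week-year 2019 instead,
    // i.e. Sunday 2018-12-30 00:00:00 UTC, matching the value you observed
    val weekYear = new SimpleDateFormat("YYYYMMdd", Locale.US)
    weekYear.setTimeZone(utc)
    println(weekYear.parse("20190105").getTime / 1000) // 1546128000
  }
}

Note that Spark 3.x switched to a different parser that rejects week-based patterns such as Y outright unless spark.sql.legacy.timeParserPolicy is set to LEGACY, so on newer versions this mistake fails loudly instead of silently shifting the date.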