
Converting long epoch timestamp into date time in PySpark


I have a spark dataframe with the following schema:

root
 |-- var1: long (nullable = true)
 |-- var2: long (nullable = true)
 |-- var3: long (nullable = true)
 |-- y_timestamp: long (nullable = true)
 |-- x_timestamp: long (nullable = true)

How do I convert the timestamps into a readable date time variable?

It currently looks like: 1561360513087


Solution

  • Using withColumn, divide the millisecond timestamp by 1000 to get seconds, then pass the result to from_unixtime to format it as a readable date-time string (note that from_unixtime formats in the Spark session's time zone). The original answer's snippet was missing its import and a closing parenthesis, and its 'yyyy-MM-dd' pattern drops the time of day; including 'HH:mm:ss' gives a full date time:

    import pyspark.sql.functions as spark_fns

    .withColumn("x_timestamp", spark_fns.expr("from_unixtime(x_timestamp/1000, 'yyyy-MM-dd HH:mm:ss')"))
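To sanity-check the arithmetic outside Spark, the same millisecond epoch from the question can be converted with Python's standard datetime module. This is only an illustration of the divide-by-1000 step, not part of the PySpark answer; it uses UTC explicitly, whereas Spark's from_unixtime uses the session time zone, so the rendered hour may differ:

```python
from datetime import datetime, timezone

millis = 1561360513087  # the example value from the question

# Epoch milliseconds -> seconds, then to an aware UTC datetime.
dt = datetime.fromtimestamp(millis / 1000, tz=timezone.utc)

# Same pattern as the Spark expression: date plus time of day.
print(dt.strftime("%Y-%m-%d %H:%M:%S"))  # → 2019-06-24 07:15:13
```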