Can someone please explain the epoch time conversion below?
How does the conversion take place, and how can I identify whether an epoch timestamp is given in seconds, milliseconds, microseconds, or nanoseconds?
Also, is there a function in PySpark to convert a date back to an epoch timestamp?
Thanks in advance!
I tried a number of methods, but I am not achieving the expected result:

import datetime

t = datetime.datetime.strptime('2021-11-12 02:12:23', '%Y-%m-%d %H:%M:%S')
print(t.strftime('%s'))  # '%s' is platform-dependent and interprets t in the local timezone

With this approach I am not able to control the format or precision, i.e. whether the result is in seconds, milliseconds, microseconds, or nanoseconds.
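On the plain-Python side, datetime.timestamp() returns a float of epoch seconds that you can scale to whatever unit you need; a minimal sketch, assuming the string is meant to be interpreted as UTC:

from datetime import datetime, timezone

# Parse the string and mark it explicitly as UTC (an assumption for this sketch).
t = datetime.strptime('2021-11-12 02:12:23', '%Y-%m-%d %H:%M:%S').replace(tzinfo=timezone.utc)

epoch_seconds = int(t.timestamp())              # 1636683143
epoch_millis = int(t.timestamp() * 1_000)       # 1636683143000
epoch_micros = int(t.timestamp() * 1_000_000)   # 1636683143000000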
The epoch time (Unix timestamp) uses a reference date of 00:00:00 UTC on 1 January 1970 and counts the seconds (or milliseconds, microseconds, nanoseconds) elapsed since that date.
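A bare number carries no unit, so identification is usually done by order of magnitude: for recent dates, an epoch value has roughly 10 digits in seconds, 13 in milliseconds, 16 in microseconds, and 19 in nanoseconds. A rough sketch of that heuristic (an assumption, not a guarantee):

# Heuristic only: assumes the value represents a reasonably recent date.
def guess_epoch_unit(value: int) -> str:
    digits = len(str(abs(int(value))))
    if digits <= 10:
        return "seconds"
    if digits <= 13:
        return "milliseconds"
    if digits <= 16:
        return "microseconds"
    return "nanoseconds"

print(guess_epoch_unit(1668443121))     # seconds
print(guess_epoch_unit(1668443121840))  # milliseconds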
The value you are looking for is in milliseconds, so you have to compute the millisecond part and concatenate it with the epoch seconds:
import pyspark.sql.functions as F

df = spark.createDataFrame([('2022-11-14T16:25:21.840+0000',)]).toDF("timestamp")

# unix_timestamp gives whole epoch seconds; date_format with "SSS" extracts the millisecond part
df\
.withColumn("timestamp", F.to_timestamp(F.col("timestamp")))\
.withColumn("epoch_seconds", F.unix_timestamp("timestamp"))\
.withColumn("epoch_milliseconds", F.concat(F.unix_timestamp("timestamp"), F.date_format("timestamp", "SSS")))\
.show(truncate=False)

# +----------------------+-------------+------------------+
# |timestamp             |epoch_seconds|epoch_milliseconds|
# +----------------------+-------------+------------------+
# |2022-11-14 16:25:21.84|1668443121   |1668443121840     |
# +----------------------+-------------+------------------+
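If you prefer a numeric result rather than string concatenation, one alternative (a sketch using the same df as above) is to cast the timestamp to a double, which yields epoch seconds including the fractional part, and scale it to milliseconds:

import pyspark.sql.functions as F

# Cast to double -> epoch seconds with fraction; round and scale to get milliseconds as a long.
df\
.withColumn("timestamp", F.to_timestamp(F.col("timestamp")))\
.withColumn("epoch_milliseconds", F.round(F.col("timestamp").cast("double") * 1000).cast("long"))\
.show(truncate=False)
# epoch_milliseconds -> 1668443121840 for the row above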