Tags: function, pyspark, timestamp, hour, minute

Is there a function in PySpark that gives the "minute of day"?


I have a column of Timestamp values. Is there a function that returns the "minute of day" for each one, that is, an integer giving the number of minutes elapsed since 00:00, the beginning of the day? For example, a Timestamp of 00:15 should become 15, 01:05 should become 65, and 03:15 should become 195. (Basically, it should compute HH*60 + MM.)

At the link below I found a function that returns the "day of year", but I could not find one for "minute of day".

https://stackoverflow.com/a/30956282/12305290

Thank you in advance!


Solution

  • Combine the PySpark SQL functions hour and minute exactly as you suggested:

    In [1]: df = spark.createDataFrame([('2015-04-08 13:08:15',)], ['ts'])

    In [2]: from pyspark.sql.functions import hour, minute
    
    In [3]: df.withColumn("minutes_since_midnight", hour(df.ts)*60 + minute(df.ts)).show()
    +-------------------+----------------------+
    |                 ts|minutes_since_midnight|
    +-------------------+----------------------+
    |2015-04-08 13:08:15|                   788|
    +-------------------+----------------------+
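
To sanity-check the arithmetic without a Spark session, the same HH*60 + MM calculation can be sketched in plain Python. This is only an illustration: the helper name minute_of_day is made up for this example and is not a PySpark function, and the sample times are the ones from the question and the answer above.

```python
from datetime import datetime


def minute_of_day(ts: datetime) -> int:
    """Minutes elapsed since 00:00 on the same day (hypothetical helper).

    Mirrors hour(ts)*60 + minute(ts); the seconds part is ignored.
    """
    return ts.hour * 60 + ts.minute


# Sample times taken from the question (first three) and the answer (last one).
samples = ["00:15:00", "01:05:00", "03:15:00", "13:08:15"]
results = {s: minute_of_day(datetime.strptime(s, "%H:%M:%S")) for s in samples}
print(results)
```

The values match the expected ones in the question (15, 65, 195) and the 788 shown in the Spark output, so the column expression is doing exactly this per row.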