
Access Pandas Dataframe based on current (now) minute


I have a DataFrame with one row per minute, and I need to access the row corresponding to the current minute:

                          value
2022-01-12 11:27:24+01:00     a
2022-01-12 11:28:41+01:00     b
2022-01-12 11:29:36+01:00     c
2022-01-12 11:30:11+01:00     d
2022-01-12 11:31:03+01:00     e
2022-01-12 11:32:39+01:00     f

I only need to match the minute; the seconds should be ignored. I tried getting the current time with both pandas and datetime (code to reproduce):

import pandas as pd
import numpy as np
import string
import datetime
start_idx=(datetime.datetime.now()).strftime(format="%Y-%m-%d %H:%M")
end_idx=(datetime.datetime.now()+datetime.timedelta(minutes=+5)).strftime(format="%Y-%m-%d %H:%M")

index_today = pd.date_range(start=start_idx, end=end_idx, freq='min', tz='Europe/Rome')  # 'min' replaces the deprecated 'T' alias
# create random seconds
index_today=[i+ pd.DateOffset(seconds=np.random.randint(0,59)) for i in index_today]
df = pd.DataFrame(index=index_today, data=list(string.ascii_lowercase[0:len(index_today)]),columns=['value'])


now_pandas = pd.to_datetime("now").round(freq='min').tz_localize('utc').tz_convert('Europe/Rome')
now_datetime = datetime.datetime.now().strftime(format="%Y-%m-%d %H:%M")

out_pandas=df.loc[df.index.floor('Min')==now_pandas, :]
out_datetime=df.loc[now_datetime, :]

print('now pandas is ',now_pandas)
print('now datetime is ',now_datetime)
print('Current value found with Pandas:\n',out_pandas)
print('Current value found with datetime\n',out_datetime)

But sometimes they give different results:

now pandas is  2022-01-12 11:46:00+01:00
now datetime is  2022-01-12 11:45
Current value found with Pandas:
                           value
2022-01-12 11:46:08+01:00     b
Current value found with datetime
                           value
2022-01-12 11:45:35+01:00     a

What is the best and most robust way to do it?

Also, I noticed that if the DataFrame is not tz-aware, pd.to_datetime("now") gives the time in UTC, so I need to localize it, convert it, and then turn it back to tz-naive. Is there a cleaner solution for that?

Thank you very much!!


Solution

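  • How to select row(s) based on the current minute? Floor the current time to the minute instead of rounding it. round('min') goes to the nearest minute, so once the seconds pass 30 it jumps ahead to the next minute; that is why now_pandas (11:46) and now_datetime (11:45) disagree in the question, since strftime("%Y-%m-%d %H:%M") just drops the seconds, which amounts to flooring. Compare the index floored to the minute against the current time floored to the minute. Ex: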

    import pandas as pd
    import numpy as np
    
    tz = 'Europe/Rome'
    now = pd.Timestamp.now(tz)
    print(now)
    # 2022-01-12 12:11:38.796675+01:00
    
    idx = pd.date_range(now.floor('d'), now.ceil('d'), freq='min')
    df = pd.DataFrame(index=idx, data=np.random.randint(0, 5, size=idx.size), columns=['value'])
    
    out_pandas = df.loc[df.index.floor('min') == now.floor('min'), :]
    print(out_pandas)
    #                            value
    # 2022-01-12 12:11:00+01:00      1
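
For the tz-naive case raised in the question, here is a minimal sketch rather than a definitive recipe. The helper name rows_for_minute is hypothetical (it is not a pandas function). It assumes the timestamp passed in is tz-aware, and that a tz-naive index holds wall-clock times in that timestamp's zone. A fixed timestamp stands in for "now" so the result is reproducible:

```python
import pandas as pd


def rows_for_minute(df, when):
    """Return the rows of df whose index falls in the same minute as `when`.

    Hypothetical helper; assumes `when` is a tz-aware pd.Timestamp.
    """
    idx = df.index
    if idx.tz is None:
        # tz-naive index: drop the zone from `when`, keeping its wall-clock time
        # (assumption: the naive index is expressed in `when`'s time zone)
        when = when.tz_localize(None)
    else:
        # tz-aware index: express `when` in the index's time zone
        when = when.tz_convert(idx.tz)
    # floor both sides so only the minute has to match, never the seconds
    return df.loc[idx.floor('min') == when.floor('min')]


tz = 'Europe/Rome'
idx = pd.DatetimeIndex(
    ['2022-01-12 11:44:12', '2022-01-12 11:45:35', '2022-01-12 11:46:08']
).tz_localize(tz)
df = pd.DataFrame({'value': ['a', 'b', 'c']}, index=idx)

# fixed stand-in for pd.Timestamp.now(tz), with the seconds past 30
now = pd.Timestamp('2022-01-12 11:45:40', tz=tz)

print(now.round('min'))  # 2022-01-12 11:46:00+01:00 (nearest minute: wrong row)
print(now.floor('min'))  # 2022-01-12 11:45:00+01:00 (current minute)

aware_hit = rows_for_minute(df, now)

# same data with the time zone stripped from the index (local time preserved)
naive_df = df.tz_localize(None)
naive_hit = rows_for_minute(naive_df, now)
```

In real use you would pass pd.Timestamp.now(tz) as the second argument. If the naive index is in the machine's local time, pd.Timestamp.now() without a tz argument already returns a tz-naive local timestamp, so no localize/convert round trip is needed before flooring.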