I have a df of minute-by-minute prices covering a 5-year period and want to establish whether any minutes are missing. A price is only stamped when there is a transaction, so some minutes have no row.
There are 4 entities in a separate column, and I would like to know which entity is missing a minute as well as when it was missing.
My first inclination is to resample and sum NaNs. What is the best way of doing this?
Until there is a better answer, here is how I have dealt with this. I followed the answer to the question "Merge with the nearest minute using pandas",
with the addition of printing all NaN values.
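import pandas as pd
# one row per minute across the whole period ('min' replaces the deprecated 'T' alias)
df_time = pd.DataFrame({'date': pd.date_range(start='yyyy/mm/dd', end='yyyy/mm/dd', freq='min')})
df_time.info()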
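Dividing the row count reported by df_time.info() by 1,440 (the number of minutes in a day) should give roughly the number of days in the range, which confirms the frame is the right size.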
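# how='left' keeps every minute from df_time; the default inner join would drop the missing ones
df_combined = pd.merge(df_time, df_price, on='date', how='left')
# print only the rows (minutes) that contain a NaN
print(df_combined[df_combined.isna().any(axis=1)])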
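As a small illustration of the merge step, here is a sketch on made-up data. The `price` column name and the toy timestamps are assumptions, not taken from the real data. `indicator=True` is a standard `pd.merge` option that adds a `_merge` column recording which frame each row came from.

```python
import pandas as pd

# Toy prices with 09:02 and 09:03 missing (values are made up for illustration)
df_price = pd.DataFrame({
    'date': pd.to_datetime(['2020-01-01 09:00', '2020-01-01 09:01', '2020-01-01 09:04']),
    'price': [10.0, 10.5, 11.0],
})

# Complete minute-by-minute calendar for the same span
df_time = pd.DataFrame({'date': pd.date_range('2020-01-01 09:00', '2020-01-01 09:04', freq='min')})

# Left merge keeps every calendar minute; unmatched minutes get NaN in 'price'
df_combined = pd.merge(df_time, df_price, on='date', how='left', indicator=True)

# Minutes that exist in the calendar but have no transaction
missing = df_combined.loc[df_combined['_merge'] == 'left_only', 'date']
print(missing)
```

Rows marked `left_only` are the minutes with no price, so this gives the "when" directly without scanning a frame of booleans.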
I then wanted each missing minute to carry the previous minute's price, since no transactions of significant difference have occurred. I did this with df_combined = df_combined.ffill()
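Since the question also asks which entity is missing each minute, a merge on `date` alone may not be enough when all four entities share one frame. One possible approach, sketched here, is to build the full (entity, minute) grid and reindex against it, then forward-fill within each entity. The `entity` and `price` column names and the sample values are assumptions for illustration, and the sketch assumes at most one row per entity per minute.

```python
import pandas as pd

# Toy data: entity B has no row for 09:01 (values are made up for illustration)
df_price = pd.DataFrame({
    'date': pd.to_datetime(['2020-01-01 09:00', '2020-01-01 09:01', '2020-01-01 09:02',
                            '2020-01-01 09:00', '2020-01-01 09:02']),
    'entity': ['A', 'A', 'A', 'B', 'B'],
    'price': [10.0, 10.5, 11.0, 20.0, 21.0],
})

# Every minute in the span, crossed with every entity
minutes = pd.date_range(df_price['date'].min(), df_price['date'].max(), freq='min')
full_index = pd.MultiIndex.from_product(
    [df_price['entity'].unique(), minutes], names=['entity', 'date'])

# Reindexing adds a NaN row for each (entity, minute) pair with no transaction
# (requires the (entity, date) pairs in df_price to be unique)
df_full = df_price.set_index(['entity', 'date']).reindex(full_index)

# Which entity is missing which minute
missing = df_full[df_full['price'].isna()].reset_index()[['entity', 'date']]
print(missing)

# Count of missing minutes per entity
missing_per_entity = df_full['price'].isna().groupby(level='entity').sum()
print(missing_per_entity)

# Forward-fill within each entity so prices never leak from one entity to another
df_full['price_filled'] = df_full.groupby(level='entity')['price'].ffill()
```

Grouping before the forward fill matters here: a plain `ffill()` on a frame that stacks several entities could carry the last price of one entity into the first missing minute of the next.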