Tags: python, pandas, dataframe, data-science, python-datetime

How to iterate over hours of a given day in Python?


I have the following time series data of temperature readings:

DT                Temperature
01/01/2019 0:00     41
01/01/2019 1:00     42
01/01/2019 2:00     44
......
01/01/2019 23:00    41
01/02/2019 0:00     44

I am trying to write a function that compares the hour-to-hour change in temperature within a given day. Any change greater than 3 should increment a quickChange counter. Something like this:

def countChange(day):
    for dt in day:
        if dt+1 - dt > 3: quickChange = quickChange+1

I would call the function for a single day, e.g. countChange(df.loc['2018-01-01'])
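
For reference, the loop sketched above could be made runnable roughly as follows. This is only a sketch: the name count_change_loop, the threshold parameter, and the sample readings are illustrative and not part of the original post. It pairs each reading with the next one and counts rises larger than the threshold.

```python
import pandas as pd

def count_change_loop(day, threshold=3):
    """Count hour-to-hour rises larger than `threshold` with an explicit loop."""
    temps = day['Temperature'].tolist()
    quick_change = 0
    # Pair each reading with the one that follows it.
    for prev, curr in zip(temps, temps[1:]):
        if curr - prev > threshold:
            quick_change += 1
    return quick_change

# Hypothetical readings, invented for illustration (hourly spacing).
idx = pd.date_range('2019-01-01', periods=4, freq='60min')
sample = pd.DataFrame({'Temperature': [41, 42, 46, 41]}, index=idx)
print(count_change_loop(sample))
```

Only the step from 42 to 46 exceeds the threshold here; the drop from 46 to 41 is not counted because the comparison is signed.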


Solution

  • Use Series.diff to get the change from the previous row, compare it with 3, and count the True values with sum:

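    import numpy as np
    import pandas as pd
    
    np.random.seed(2019)
    
    # Sample data: 10 hourly readings on each of two days
    # (pandas 2.2+ prefers the lowercase alias freq='h')
    rng = (pd.date_range('2018-01-01', periods=10, freq='H').tolist() +
           pd.date_range('2018-01-02', periods=10, freq='H').tolist())
    df = pd.DataFrame({'Temperature': np.random.randint(100, size=20)}, index=rng)
    print (df)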
                         Temperature
    2018-01-01 00:00:00           72
    2018-01-01 01:00:00           31
    2018-01-01 02:00:00           37
    2018-01-01 03:00:00           88
    2018-01-01 04:00:00           62
    2018-01-01 05:00:00           24
    2018-01-01 06:00:00           29
    2018-01-01 07:00:00           15
    2018-01-01 08:00:00           12
    2018-01-01 09:00:00           16
    2018-01-02 00:00:00           48
    2018-01-02 01:00:00           71
    2018-01-02 02:00:00           83
    2018-01-02 03:00:00           12
    2018-01-02 04:00:00           80
    2018-01-02 05:00:00           50
    2018-01-02 06:00:00           95
    2018-01-02 07:00:00            5
    2018-01-02 08:00:00           24
    2018-01-02 09:00:00           28
    

    # If DT is still a column (as in the question), parse it and make it the DatetimeIndex first:
    # df['DT'] = pd.to_datetime(df['DT'])
    # df = df.set_index('DT')
    
    def countChange(day):
        return (day['Temperature'].diff() > 3).sum()
    
    print (countChange(df.loc['2018-01-01']))
    4
    
    print (countChange(df.loc['2018-01-02']))
    6
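
Note that diff() > 3 counts only rises. If drops of more than 3 should count as well ("any change"), compare the absolute difference instead. The sketch below also shows one possible way to get the counts for every day at once by grouping on the calendar date, so that no difference is taken across midnight. The readings are invented for illustration, in the DT format from the question, and the variable names are assumptions.

```python
import pandas as pd

# Hypothetical readings in the question's DT format (values invented for illustration).
raw = pd.DataFrame({
    'DT': ['01/01/2019 0:00', '01/01/2019 1:00', '01/01/2019 2:00',
           '01/02/2019 0:00', '01/02/2019 1:00', '01/02/2019 2:00'],
    'Temperature': [41, 46, 40, 44, 45, 50],
})
raw['DT'] = pd.to_datetime(raw['DT'], format='%m/%d/%Y %H:%M')
temps = raw.set_index('DT')['Temperature']

# Group by calendar day so diff() never spans two different days.
day = temps.index.normalize()
step = temps.groupby(day).diff()

# Rises only (same rule as countChange above), counted per day.
rises_per_day = (step > 3).groupby(day).sum()

# Rises and drops: compare the absolute hour-to-hour change.
swings_per_day = (step.abs() > 3).groupby(day).sum()

print(rises_per_day)
print(swings_per_day)
```

The first reading of each day has no predecessor within that day, so its difference is NaN and never counts as a quick change.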