Search code examples
pythonpandaspanel-data

Python count number of daily changes in panel data


I have a pandas dataframe of panel data where each row is a time series for an individual, each column is a day in the time series. On a daily basis I would like to count the number of day on day changes so I can determine what percentage of individuals change each day.

indiv = ['Tom', 'Mike', 'Dave']
date_one = ['yes', 'yes', 'no']
date_two = ['no', 'yes', 'no']
date_three = ['yes', 'yes', 'yes']
df = pd.DataFrame(zip(indiv, date_one, date_two, date_three), columns = ['name', '2020-01-29', '2020-01-30', '2020-01-31'])

Solution

  • Set the index to the name then transpose the df.

    df = df.set_index('name').T
    
       name 2020-01-29 2020-01-30 2020-01-31
    0   Tom        yes         no        yes
    1  Mike        yes        yes        yes
    2  Dave         no         no        yes
    

    Then replace the 'yes' and 'no' values with True and False

    df.replace({'yes': True, 'no': False}, inplace=True)
    
       name  2020-01-29  2020-01-30  2020-01-31
    0   Tom        True       False        True
    1  Mike        True        True        True
    2  Dave       False       False        True
    

    Now sum across the columns...

    df['changes'] = df.sum(axis=1)
    

    Resulting in

    name          Tom  Mike   Dave  changes
    2020-01-29   True  True  False        2
    2020-01-30  False  True  False        1
    2020-01-31   True  True   True        3