Tags: python, pandas, dataframe, numpy, cumsum

Summing up values in columns up to a defined value


I have a dataframe in which the columns look like this:

Date = [01/01/2021, 02/02/2021, ..., 12/31/2021]

T_mean = [1.2, 2.7, 3.5, 2.9, 4.4, ...]

I would like to add a column to the dataframe that holds the running sum of the T_mean values, as follows:

Sum_Tmean = [1.2, 3.9, 7.4, 10.3, 14.7, ...]

As soon as the value 10 is reached or exceeded, I would like to output the Date on which this happens. If possible, I would also like that entire row to be highlighted in bold.

I computed the total sum of T_mean with the following code:

Sum_Tmean = dataframe['T_mean'].sum()

However, this only gives the final total, and I don't know how to accumulate the individual values row by row.

I added the new column to the dataframe with the code:

dataframe.insert(3, "Sum_Tmean", Sum_Tmean, allow_duplicates=False)

I want to apply this to several decades of data, where the temperature limit is 200 °C, so the limit is reached at some point during the year rather than in the first few days as in the example.

I appreciate any tips and thanks in advance.


Solution

  • I'd suggest using NumPy's cumulative sum function, cumsum, like below:

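    import numpy as np
    import pandas as pd

    def highlight_bold(row):
        # Bold every cell of a row whose running sum has reached or exceeded 10.
        style = 'font-weight: bold' if row['Sum_Tmean'] >= 10.0 else ''
        return [style] * len(row)

    # Sample data (the dates are placeholders and repeat)
    Date = ['01/01/2021', '02/02/2021', '12/31/2021', '01/01/2021', '02/02/2021', '12/31/2021']

    T_mean = [1.2, 2.7, 3.5, 2.9, 4.4, 5.2]

    # Running (cumulative) sum of T_mean
    Sum_Tmean = list(np.cumsum(T_mean))

    d = {'Date': Date, 'T_mean': T_mean, 'Sum_Tmean': Sum_Tmean}

    df = pd.DataFrame(d)

    # axis=1 passes one row at a time to highlight_bold
    styler = df.style.apply(highlight_bold, axis=1)

    # In a notebook, the last expression renders the styled table
    styler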
    

    output: the styled table, in which every row whose Sum_Tmean has reached 10 is shown in bold.
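
The question also asks for the date on which the limit is reached, and for the same check across several years. A minimal sketch of one way to do that is below. It assumes the running sum should restart every calendar year (the question does not say so explicitly) and that dates are in MM/DD/YYYY format. The sample values, the `Year` helper column, and the `LIMIT` name are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data covering two years; column names follow the question.
df = pd.DataFrame({
    "Date": pd.to_datetime(
        ["01/01/2020", "01/02/2020", "01/03/2020", "01/04/2020",
         "01/01/2021", "01/02/2021", "01/03/2021", "01/04/2021"],
        format="%m/%d/%Y",
    ),
    "T_mean": [1.2, 2.7, 3.5, 2.9, 4.4, 5.2, 1.3, 1.0],
})

LIMIT = 10.0  # assumed threshold for this small sample; 200.0 for the real data

# Helper column so the running sum can restart at the start of each year.
df["Year"] = df["Date"].dt.year
df["Sum_Tmean"] = df.groupby("Year")["T_mean"].cumsum()

# Keep rows at or above the limit, then take the first such row per year.
reached = df[df["Sum_Tmean"] >= LIMIT]
first_hit = reached.groupby("Year").head(1)

print(first_hit[["Date", "Sum_Tmean"]])
```

For a single year, `df["T_mean"].cumsum()` without the `groupby` should be enough. A year that never reaches the limit simply has no row in `first_hit`.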