
Divide rows of python pandas DataFrame


I have a pandas DataFrame df like this

   mat  time
0  101   20
1  102    7
2  103   15

I need to split the rows so that no value in the time column exceeds t = 10, giving something like this:

   mat  time
0  101   10
2  101   10
3  102    7
4  103   10
5  103    5

The index doesn't matter.

Calling groupby('mat')['time'].sum() on this result would give back the original df, so what I need is something like the inverse of groupby.
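
To make the "inverse of groupby" idea concrete, here is a small sketch that rebuilds both frames from the tables above and checks that summing the split rows per mat recovers the original times. The variable names original, desired and regrouped are only illustrative.

```python
import pandas as pd

# The starting frame and the desired split frame, copied from the tables above.
original = pd.DataFrame({"mat": [101, 102, 103], "time": [20, 7, 15]})
desired = pd.DataFrame({"mat": [101, 101, 102, 103, 103],
                        "time": [10, 10, 7, 10, 5]})

# Summing time per mat collapses the split rows back into one row per mat.
regrouped = desired.groupby("mat", as_index=False)["time"].sum()
print(regrouped)
```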

Is there any way to get the ungrouped DataFrame with the condition of time <= t?

I'm trying to use a loop here but it's kind of 'unPythonic', any ideas?


Solution

  • Use groupby with an apply function that loops until every time value is at most 10.

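    import pandas as pd

    def split_max_time(df, t=10):
        # Called once per 'mat' group; only the last row's time is split.
        new_df = df.copy()
        while new_df.iloc[-1, -1] > t:
            temp = new_df.iloc[-1, -1]
            new_df.iloc[-1, -1] = t
            # Append a copy of the last row only. Concatenating the whole
            # frame with itself would duplicate rows already capped at t
            # and give wrong totals once time exceeds 2 * t.
            new_df = pd.concat([new_df, new_df.iloc[[-1]]])
            new_df.iloc[-1, -1] = temp - t
        return new_df


    print(df.groupby('mat', group_keys=False).apply(split_max_time))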
    
       mat  time
    0  101    10
    0  101    10
    1  102     7
    2  103    10
    2  103     5
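
Since the question asks for something less loop-heavy, here is a possible vectorized sketch that avoids groupby.apply altogether (newer pandas versions may also emit a deprecation warning when apply operates on the grouping column). It is not part of the answer above. The helper name split_rows and its col and t parameters are assumptions made for illustration, and it assumes the time values are non-negative numbers.

```python
import numpy as np
import pandas as pd

def split_rows(df, col="time", t=10):
    """Hypothetical helper: split each row so no value in `col` exceeds t."""
    # Work on a clean 0..n-1 index so repeated labels identify source rows.
    df = df.reset_index(drop=True)
    # Pieces needed per row: ceil(value / t), at least 1 so zero-time rows survive.
    n_parts = np.ceil(df[col] / t).clip(lower=1).astype(int)
    # Repeat each row once per piece.
    out = df.loc[df.index.repeat(n_parts)]
    # Position of each piece within its source row: 0, 1, 2, ...
    pos = out.groupby(level=0).cumcount().to_numpy()
    # Each piece gets t, except the last one, which gets the remainder.
    pieces = np.minimum(out[col].to_numpy() - pos * t, t)
    return out.assign(**{col: pieces}).reset_index(drop=True)

df = pd.DataFrame({"mat": [101, 102, 103], "time": [20, 7, 15]})
result = split_rows(df)
print(result)
```

The sketch computes how many pieces each row needs, repeats the row that many times, and then assigns t to every piece except the last, which takes whatever is left over. Unlike the per-group loop, it splits every row of the frame rather than only the last row of each mat group.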