Search code examples
pythonpandasdataframeconcatenationnested-loops

Pandas add row for each cell where function is applied to that specific cell while other element of new row are equal to starting row


I want to loop through each element of a pandas dataframe row such that only that element is stressed (ie: it's multiplied by 10%) while the other elements of the row are kept equal.

I'm planning to use this for sensitivity analysis.

Example: df = pd.DataFrame({'AGE':[5,10],'POP':[100,200]})

AGE POP
5 100
10 200

Final desired output:

AGE POP
5 100
10 200
5*1.1 100
5 100*1.1
10*1.1 200
10 200*1.1

Solution

  • If you have 2 columns, you can multiply with the [1, stress] and its reverse those columns, concatanate them while sorting to preserve multiplied column order. Lastly, prepend the original frame as well:

    stress = 1.1
    factor = [stress, 1]
    pd.concat([df,
               pd.concat([df.mul(factor),
                          df.mul(factor[::-1])]).sort_index()
              ], ignore_index=True)
    
        AGE    POP
    0   5.0  100.0
    1  10.0  200.0
    2   5.5  100.0
    3   5.0  110.0
    4  11.0  200.0
    5  10.0  220.0
    

    Generalizing to N columns could be via a comprehension:

    def gen_factors(stress, N):
        for j in range(N):
            # make all-1s list, except j'th is `stress`
            f = [1] * N
            f[j] = stress
            yield f
    
    stress = 1.1
    N = len(df.columns)
    pd.concat([df,
               pd.concat(df.mul(factor)
                         for factor in gen_factors(stress, N)).sort_index()
              ], ignore_index=True)
    

    Example run for a 3-column frame:

    >>> df
    
       AGE  POP  OTHER
    0    5  100      7
    1   10  200      8
    
    >>> # output of above:
    
        AGE    POP  OTHER
    0   5.0  100.0    7.0
    1  10.0  200.0    8.0
    2   5.5  100.0    7.0
    3   5.0  110.0    7.0
    4   5.0  100.0    7.7
    5  11.0  200.0    8.0
    6  10.0  220.0    8.0
    7  10.0  200.0    8.8