
Perform row multiplication in data frame


I want to perform the following operation in pandas, without converting my DataFrame to an array first.

date      A      B     C     D     E    ...
date1     0,03  0,02  0,01   0,01 0,234
date2     0,03  0,02  0,01   0,01 0,234
date3     0,03  0,02  0,01   0,01 0,234
date4     0,03  0,02  0,01   0,01 0,234

The real numbers differ from cell to cell and have many decimal places. I want to create another DataFrame containing the following:

date      value      
date1     (1+0,03)*(1+0,02)*(1+0,01)*(1+0,01)*(1+0,234)
date2     (1+0,03)*(1+0,02)*(1+0,01)*(1+0,01)*(1+0,234)
date3     (1+0,03)*(1+0,02)*(1+0,01)*(1+0,01)*(1+0,234)
date4     (1+0,03)*(1+0,02)*(1+0,01)*(1+0,01)*(1+0,234)

Some cells are null, and I want to skip those values. I would show what I have tried, but all I did was convert to an array and perform the operation there; that way I lose my data and can't skip the null values.


Solution

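  • Set the dates as the index with DataFrame.set_index (if they are not the index already), add 1 to each value, and take the row-wise product with DataFrame.prod(axis=1). prod skips missing values by default (skipna=True), so null cells are ignored: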

    # if the values are strings with decimal commas, replace ',' with '.'
    # first so that they can be converted to floats
    #df = df.replace(',','.', regex=True)
    df1 = df.set_index('date').astype(float).add(1).prod(axis=1).reset_index(name='value')
    print (df1)
        date     value
    0  date1  1.322499
    1  date2  1.322499
    2  date3  1.322499
    3  date4  1.322499
    

    Test with a missing value:

    print (df)
        date     A     B     C     D      E
    0  date1  0,03  0,02  0,01  0,01    NaN
    1  date2  0,03  0,02  0,01  0,01  0,234
    2  date3  0,03  0,02  0,01  0,01  0,234
    3  date4  0,03  0,02  0,01  0,01  0,234
    
    df = df.replace(',','.', regex=True)
    
    print (df.set_index('date').astype(float).add(1))
              A     B     C     D      E
    date                                
    date1  1.03  1.02  1.01  1.01    NaN
    date2  1.03  1.02  1.01  1.01  1.234
    date3  1.03  1.02  1.01  1.01  1.234
    date4  1.03  1.02  1.01  1.01  1.234
    
    df1 = df.set_index('date').astype(float).add(1).prod(axis=1).reset_index(name='value')
    print (df1)
        date     value
    0  date1  1.071717
    1  date2  1.322499
    2  date3  1.322499
    3  date4  1.322499
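
For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch of the same approach. The sample frame is rebuilt by hand from the values shown in the question (strings with decimal commas, one missing value in column E), and the intermediate name `values` is only illustrative; your real data may already be numeric, in which case the replace step is unnecessary.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question: decimal commas stored as strings,
# with one missing value in column E for date1.
df = pd.DataFrame({
    "date": ["date1", "date2", "date3", "date4"],
    "A": ["0,03"] * 4,
    "B": ["0,02"] * 4,
    "C": ["0,01"] * 4,
    "D": ["0,01"] * 4,
    "E": [np.nan, "0,234", "0,234", "0,234"],
})

# Move the dates into the index so only the value columns are converted,
# swap decimal commas for points, then cast to float (NaN stays NaN).
values = df.set_index("date").replace(",", ".", regex=True).astype(float)

# Add 1 to every cell and multiply across each row.
# skipna=True is the default, spelled out here to make the NaN handling visible.
df1 = values.add(1).prod(axis=1, skipna=True).reset_index(name="value")

print(df1)
```

One thing to keep in mind: with the defaults, a row whose cells are all null gives a product of 1.0. If you would rather get NaN for such rows, passing min_count=1 to prod should do that.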