Tags: python, pandas, dataframe, elementwise-operations

How to do element-wise DataFrame manipulation?


I have two DataFrames, one holding the estimated means and the other the standard errors.

import numpy as np
import pandas as pd
import scipy.stats as st

rows = (1, 2)
col1 = ["mean_x", "mean_y"]
col2 = ["std_x", "std_y"]

data1 = ([10, 20], [5, 10])
data2 = ([1, 2], [0.5, 1])

df1 = pd.DataFrame(data1, index = rows, columns = col1)
df2 = pd.DataFrame(data2, index = rows, columns = col2)


I want to combine these two DataFrames element-wise to build a DataFrame of confidence intervals at the 95% level.

The ideal format is

[image: desired output table]

Running a loop seems awkward. Is there a more elegant and efficient method?


Solution

  • You can use stats.norm.interval to compute the 95% confidence interval for each mean/standard-error pair, then store the results as new DataFrame columns:

    >>> from scipy import stats
    >>> twoDf = pd.concat([df1, df2], axis=1)
    >>> twoDf
    
       mean_x  mean_y  std_x  std_y
    1      10      20    1.0      2
    2       5      10    0.5      1
    
    >>> cols = [('mean_x','std_x', 'interval_x'),('mean_y','std_y', 'interval_y')]
    
    >>> for mean_col, std_col, out_col in cols:
    ...     twoDf[out_col] = twoDf.apply(
    ...         lambda row: stats.norm.interval(0.95, loc=row[mean_col], scale=row[std_col]),
    ...         axis=1)
    
    >>> twoDf
    
       mean_x  mean_y  std_x  std_y                               interval_x                                interval_y
    1      10      20    1.0      2  (8.040036015459947, 11.959963984540053)  (16.080072030919894, 23.919927969080106)
    2       5      10    0.5      1   (4.020018007729973, 5.979981992270027)   (8.040036015459947, 11.959963984540053)
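
Since the question asks for something more efficient than a loop, here is a hedged alternative sketch that avoids the row-wise apply. It relies on stats.norm.interval broadcasting over NumPy arrays, so a single call covers every cell. The output layout (one lower/upper column pair per variable) and the names ci, lower and upper are assumptions for illustration, because the desired format was only shown as an image.

```python
import pandas as pd
from scipy import stats

# Same inputs as in the question.
df1 = pd.DataFrame([[10, 20], [5, 10]], index=(1, 2), columns=["mean_x", "mean_y"])
df2 = pd.DataFrame([[1, 2], [0.5, 1]], index=(1, 2), columns=["std_x", "std_y"])

# norm.interval accepts array-valued loc and scale, so this one call
# computes both bounds for all (row, column) cells at once.
lower, upper = stats.norm.interval(0.95, loc=df1.to_numpy(), scale=df2.to_numpy())

# Assumed layout: "x_lower", "x_upper", "y_lower", "y_upper".
# The variable name is taken from the part after "mean_".
ci = pd.DataFrame(index=df1.index)
for i, col in enumerate(df1.columns):
    name = col.split("_", 1)[1]
    ci[f"{name}_lower"] = lower[:, i]
    ci[f"{name}_upper"] = upper[:, i]

print(ci)
```

This pairs columns by position, so it assumes df1 and df2 list their variables in the same order and share the same index. If tuples are preferred, as in the answer above, the two bound columns can be zipped together afterwards.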