Tags: python, pandas, dataframe, mean

How can I calculate the mean of a specific cell in a dataframe in Python?


[Image: Dataframe]

I have a dataframe that has multiple floats per cell. I need to calculate the mean of each of these cells and have the results put into a new dataframe.

How can I do that in Python?


Solution

  • The general solution is an elementwise mean of each cell. Sample data:

    print (df)
         Sub_1    Sub_2
    0  [1,2,3]  [4,5,3]
    1  [1,7,3]  [4,8,3]
    

    If every cell holds the same number of values, you can build a 3D NumPy array and take the mean along the last axis:

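    import numpy as np
    import pandas as pd

    # Stack the nested lists into a (rows, columns, values) array, then average over the values axis.
    arr = np.mean(np.array(df.to_numpy().tolist()), axis=2)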
    
    df1 = pd.DataFrame(arr, columns=df.columns, index=df.index)
    print (df1)
          Sub_1  Sub_2
    0  2.000000    4.0
    1  3.666667    5.0
    
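    # General solution: elementwise mean of each cell (also works when the lists differ in length).
    # Note: applymap is deprecated since pandas 2.1; there, use df.map(np.mean) instead.
    df1 = df.applymap(np.mean)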
    print (df1)
          Sub_1  Sub_2
    0  2.000000    4.0
    1  3.666667    5.0
    

    Or:

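    # Needs pandas >= 1.3 (multi-column explode); lists in the same row must have equal lengths.
    # explode leaves object dtype, so cast to float before averaging.
    df1 = df.explode(['Sub_1','Sub_2']).astype(float).groupby(level=0).mean()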
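
For reference, here is a minimal self-contained sketch of the elementwise approach that should run on both older and newer pandas versions, since it avoids `applymap` and `DataFrame.map` and uses `Series.map` per column instead. The helper name `cell_means` and the `ragged` frame are made up for illustration and are not part of the question; the first frame reproduces the sample data above.

```python
import numpy as np
import pandas as pd

# Sample data from above: every cell holds a list of numbers.
df = pd.DataFrame({
    "Sub_1": [[1, 2, 3], [1, 7, 3]],
    "Sub_2": [[4, 5, 3], [4, 8, 3]],
})


def cell_means(frame: pd.DataFrame) -> pd.DataFrame:
    """Return a new DataFrame holding the mean of each list-valued cell.

    Hypothetical helper: applies np.mean to every cell, one column at a time.
    """
    return frame.apply(lambda col: col.map(np.mean)).astype(float)


df1 = cell_means(df)
print(df1)

# Illustrative data (an assumption, not from the question): lists of unequal
# length. The 3D-array approach cannot stack these, but the elementwise
# mean still works.
ragged = pd.DataFrame({
    "Sub_1": [[1, 2], [1, 7, 3]],
    "Sub_2": [[4], [4, 8, 3, 5]],
})
ragged_means = cell_means(ragged)
print(ragged_means)
```

The result keeps the original index and column labels, so it can be compared or joined back to `df` directly.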