How to assign a dataframe mean to specific rows of a dataframe?


I have a data frame like this:

df_a = pd.DataFrame({'a': [2, 4, 5, 6, 12],
                 'b': [3, 5, 7, 9, 15]})

    Out[112]: 
    a   b
0   2   3
1   4   5
2   5   7
3   6   9
4  12  15

and its column means:

df_a.mean()

Out[118]: 
a   5.800
b   7.800
dtype: float64

I want to do this:

df_a[df_a.index.isin([3, 4])] = df.mean()

But I'm getting an error. How do I achieve this? This is only an example: in the data I am working with, there are many observations that I need to change, and I keep their index values in a list.


Solution

  • If you want to overwrite the rows whose positions are in a list, you can do it with iloc. Here the positions match the index labels because df_a has the default integer index:

    import pandas as pd

    df_a = pd.DataFrame({'a': [2, 4, 5, 6, 12], 'b': [3, 5, 7, 9, 15]})

    # Row positions to overwrite
    idx_list = [3, 4]
    # Assign the column means to every selected row
    df_a.iloc[idx_list, :] = df_a.mean()
    
    

    Output

         a    b
    0  2.0  3.0
    1  4.0  5.0
    2  5.0  7.0
    3  5.8  7.8
    4  5.8  7.8
    

    Edit

    If you're using an older version of pandas and see NaNs instead of the expected values, you can assign the means row by row in a for loop:

    df_a_mean = df_a.mean()
    for i in idx_list:
        df_a.iloc[i,:] = df_a_mean
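
Since the question keeps index values (labels) in the list rather than positions, a label-based variant with loc may be a closer fit when the index is not the default 0..n-1. The following is a minimal sketch and not part of the original answer: the string index labels and the cast to float are assumptions made for illustration (the cast avoids writing float means into integer columns).

```python
import pandas as pd

# Hypothetical non-default index, used only to show label-based selection
df_a = pd.DataFrame({'a': [2, 4, 5, 6, 12], 'b': [3, 5, 7, 9, 15]},
                    index=['r0', 'r1', 'r2', 'r3', 'r4'])

# Assumption: cast to float first so the means fit the column dtype
df_a = df_a.astype(float)

idx_list = ['r3', 'r4']   # index labels, not positions
col_means = df_a.mean()   # computed once, before any row is overwritten

# Assign one column at a time; the scalar mean is broadcast
# to every selected row of that column
for col in df_a.columns:
    df_a.loc[idx_list, col] = col_means[col]

print(df_a)
```

Computing the means before the assignment matters: if the mean were recomputed after some rows had already been overwritten, the later rows would receive a different value.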