python datatable grouping average data-cleaning

Add column of average in data frame using Python


I want to create a new data frame consisting of sex, number of children, price of insurance, and whether an individual is a smoker. Below is an example of my data frame.

Sex    Children Insurance Smoker
Male      3      392.48    Yes
Male      6      782.68    Yes
Male      6      438.21    No 
Female    1      125.98    Yes
Female    1      58.32     No
Female    4      585.12    Yes
Female    4      356.12    No

So far I have this, using the code:

df = pd.DataFrame(insurance).groupby(["sex", "children", "smoker"]).size()

#which outputs
sex      children   smoker
female   1          yes      1
         1          no       1
         4          yes      1
         4          no       1
male     3          yes      2
         6          yes      1
         6          no       1

How would I add a column with the average insurance price for each sex, depending on how many children they have and whether they smoke? I tried adding mean("insurance") but got an error, of course. Thank you so much for the help!


Solution

  • df.groupby(["Sex", "Children", "Smoker"], as_index=False)["Insurance"].mean()
    
    #output
    
          Sex  Children  Smoker  Insurance
    0  Female         1      No      58.32
    1  Female         1     Yes     125.98
    2  Female         4      No     356.12
    3  Female         4     Yes     585.12
    4    Male         3     Yes     392.48
    5    Male         6      No     438.21
    6    Male         6     Yes     782.68
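
For anyone who wants to run this end to end, here is a minimal sketch built from the seven sample rows in the question. The variable names `insurance`, `keys`, `group_means` and the column name `AvgInsurance` are my own choices, not from the original post. Note that the question's code reassigns `df` to the result of `.size()`, which no longer contains the insurance values, so the grouping has to start from the original frame. The second call uses `transform("mean")`, an alternative that keeps every original row and attaches the group average as a new column, which may be closer to what the title asks for.

```python
import pandas as pd

# Sample rows copied from the question; the real data set is assumed to be larger.
insurance = pd.DataFrame({
    "Sex": ["Male", "Male", "Male", "Female", "Female", "Female", "Female"],
    "Children": [3, 6, 6, 1, 1, 4, 4],
    "Insurance": [392.48, 782.68, 438.21, 125.98, 58.32, 585.12, 356.12],
    "Smoker": ["Yes", "Yes", "No", "Yes", "No", "Yes", "No"],
})

keys = ["Sex", "Children", "Smoker"]

# One row per (Sex, Children, Smoker) group with that group's mean insurance price.
group_means = insurance.groupby(keys, as_index=False)["Insurance"].mean()

# Alternative: keep all original rows and add the group mean as an extra column.
# "AvgInsurance" is a hypothetical column name chosen for this sketch.
insurance["AvgInsurance"] = insurance.groupby(keys)["Insurance"].transform("mean")

print(group_means)
print(insurance)
```

In this small sample every group contains exactly one row, so each average equals the original price. With more rows per group the two columns would differ.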
    

    Is this what you want, with the group size and the mean side by side?

          Sex  Children  Smoker  size    mean
    0  Female         1      No     1   58.32
    1  Female         1     Yes     1  125.98
    2  Female         4      No     1  356.12
    3  Female         4     Yes     1  585.12
    4    Male         3     Yes     1  392.48
    5    Male         6      No     1  438.21
    6    Male         6     Yes     1  782.68
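
The code that produced the table above is not shown, so the following is only one possible way to get a `size` and a `mean` column in a single pass, using pandas named aggregation. It is a sketch under that assumption, not necessarily what the answerer ran. The names `insurance` and `summary` are hypothetical.

```python
import pandas as pd

# Sample rows copied from the question.
insurance = pd.DataFrame({
    "Sex": ["Male", "Male", "Male", "Female", "Female", "Female", "Female"],
    "Children": [3, 6, 6, 1, 1, 4, 4],
    "Insurance": [392.48, 782.68, 438.21, 125.98, 58.32, 585.12, 356.12],
    "Smoker": ["Yes", "Yes", "No", "Yes", "No", "Yes", "No"],
})

# Named aggregation: each keyword becomes an output column,
# defined as (source column, aggregation function).
summary = (
    insurance.groupby(["Sex", "Children", "Smoker"], as_index=False)
    .agg(size=("Insurance", "size"), mean=("Insurance", "mean"))
)

print(summary)
```

Because `size` and `mean` are also DataFrame attribute and method names, read these columns with bracket syntax, for example `summary["size"]`, rather than `summary.size`.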