I'm trying to compute the mean age of blonde people from the data in df:
np.random.seed(0)
df = pd.DataFrame(
    {
        'Age': [18, 21, 28, 19, 23, 22, 18, 24, 25, 20],
        'Hair colour': [
            'Blonde', 'Brown', 'Black', 'Blonde', 'Blonde',
            'Black', 'Brown', 'Brown', 'Black', 'Black'],
        'Length (in cm)': np.random.normal(175, 10, 10).round(1),
        'Weight (in kg)': np.random.normal(70, 5, 10).round(1),
    },
    index=[
        'Leon', 'Mirta', 'Nathan', 'Linda', 'Bandar',
        'Violeta', 'Noah', 'Niji', 'Lucy', 'Mark'],
)
I need to get a single number.
First, I tried to use df.divide:
import pandas as pd
import numpy as np
ans_3 = df({'Age'}).divide(df({'Hair colour': ['Blonde']}))
However, I got this error: TypeError: 'DataFrame' object is not callable.
How should I change my code to get the correct result?
Run:
df[df['Hair colour'] == 'Blonde'].Age.mean()
Details:
df['Hair colour'] == 'Blonde'
- generates a Series of bool type, stating whether the current row has Blonde hair.
df[…]
- gets the rows meeting the above condition.
.Age
- from the above rows, takes only the Age column.
.mean()
- computes the mean age.
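As for the TypeError: writing df(...) with parentheses tries to call the DataFrame as if it were a function, whereas selecting rows or columns uses square brackets. The steps above can be sketched one at a time as follows. This is only an illustration: the intermediate names (is_blonde, blondes, blonde_ages, mean_age, mean_age_loc) are made up here, and the two random columns from the question are left out because they do not affect the result.

```python
import pandas as pd

# Same ages, hair colours and index as in the question; the random
# 'Length' and 'Weight' columns are omitted since they are not needed.
df = pd.DataFrame(
    {
        'Age': [18, 21, 28, 19, 23, 22, 18, 24, 25, 20],
        'Hair colour': [
            'Blonde', 'Brown', 'Black', 'Blonde', 'Blonde',
            'Black', 'Brown', 'Brown', 'Black', 'Black'],
    },
    index=[
        'Leon', 'Mirta', 'Nathan', 'Linda', 'Bandar',
        'Violeta', 'Noah', 'Niji', 'Lucy', 'Mark'],
)

# Step 1: boolean Series, True for rows whose hair colour is 'Blonde'.
is_blonde = df['Hair colour'] == 'Blonde'

# Step 2: keep only the rows where the mask is True.
blondes = df[is_blonde]

# Step 3: take just the 'Age' column of those rows.
blonde_ages = blondes['Age']

# Step 4: average the remaining ages (18, 19 and 23).
mean_age = blonde_ages.mean()
print(mean_age)  # 20.0

# The same selection in a single .loc call (row mask, column label).
mean_age_loc = df.loc[is_blonde, 'Age'].mean()
```

The .loc form selects the rows and the column in one indexing operation, which some people find easier to read. Both forms give the same number.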