I have a dataframe:
country rating owner
0 England a John Smith
1 England b John Smith
2 France a Frank Foo
3 France a Frank foo
4 France a Frank Foo
5 France b Frank Foo
I'd like to produce a count of owners after grouping by country and rating and
I am expecting:
country rating owner count
0 England a John Smith 1
1 England b John Smith 1
2 France a Frank Foo 3
3 France b Frank Foo 1
I have tried:
df.group_by(['rating','owner'])['owner'].count()
and
df.group_by(['rating','owner'].str.lower())['owner'].count()
Use title
and replace
to rework the string and groupby.size
to aggregate:
out = (df.groupby(['country', 'rating',
df['owner'].str.title().str.replace(r'\s+', ' ', regex=True)])
.size().reset_index(name='count')
)
Output:
country rating owner count
0 England a John Smith 1
1 England b John Smith 1
2 France a Frank Foo 3
3 France b Frank Foo 1