I have the following data frame:
Name | id | Model |
---|---|---|
Alice | alice_1 | (A_01), (A_02) |
Bob | bob_1 | (B_01) |
Alice | alice_2 | (A_01), (A_05) |
Alice | alice_3 | (A_01), (A_05) |
Bob | bob_2 | (B_01) |
Bob | bob_3 | (B_01) |
I would like to count the unique model values inside the brackets and append the count in front of the bracket like that:
Name | Model |
---|---|
Alice | 3x (A_01), 2x (A_05), 1x (A_02) |
Bob | 3x(B_01) |
I tried to use different approaches with group by and aggregate functions but could not find a way. Also I can use value counts and count each Model but then I don't know how to append the resulting number the the whole data frame.
Use Series.str.split
with DataFrame.explode
for new rows by joinjed values by ,
, then get counts by GroupBy.size
, sorting and add to Model
column, last aggregate join
:
df = (df.assign(Model = df['Model'].str.split(', '))
.explode('Model')
.groupby(['Name','Model'])
.size()
.sort_values(ascending=False)
.astype(str)
.add('x')
.reset_index(level=1)
.assign(Model = lambda x: x[0].str.cat(x['Model']))
.groupby('Name')['Model']
.agg(', '.join)
.reset_index())
print (df)
Name Model
0 Alice 3x(A_01), 2x(A_05), 1x(A_02)
1 Bob 3x(B_01)