Question: How can I count how often each cast member appears, from the most frequent to the least frequent?
The output I want is:
cast count
Alan Marriott 100
Jandino Asporaa 78
...
Peter 1
#1 Try:
df.groupby(by=['cast','show_id']).count()
Output:
cast show_id type title director country date_added release_year rating duration listed_in description
4Minute 80161826 1 1 0 1 1 1 1 1 1 1
50 Cent 70199239 1 1 1 1 1 1 1 1 1 1
A.J LoCascio 80141858 1 1 1 1 1 1 1 1 1 1
#2 Try:
df.groupby(cast)[show_id].count()
Output:
NameError: name 'cast' is not defined
#3 Try:
df.groupby(by='cast')
Output:
<pandas.core.groupby.generic.DataFrameGroupBy object at 0x7f2f3894bcd0>
Sample of the dataset:
import numpy as np
import pandas as pd

# np.nan marks missing directors (a bare NaN is not defined in Python)
df = pd.DataFrame({
    'show_id': ['81145628', '80117401', '70234439'],
    'type': ['Movie', 'Movie', 'TV Show'],
    'title': ['Norm of the North: King Sized Adventure',
              'Jandino: Whatever it Takes',
              'Transformers Prime'],
    'director': ['Richard Finn, Tim Maltby', np.nan, np.nan],
    'cast': ['Alan Marriott, Andrew Toth, Brian Dobson',
             'Jandino Asporaat', 'Peter Cullen, Sumalee Montano, Frank Welker'],
    'country': ['United States, India, South Korea, China',
                'United Kingdom', 'United States'],
    'date_added': ['September 9, 2019',
                   'September 9, 2016',
                   'September 8, 2018'],
    'release_year': ['2019', '2016', '2013'],
    'rating': ['TV-PG', 'TV-MA', 'TV-Y7-FV'],
    'duration': ['90 min', '94 min', '1 Season'],
    'listed_in': ['Children & Family Movies, Comedies',
                  'Stand-Up Comedy', 'Kids TV'],
    'description': ['Before planning an awesome wedding for his',
                    'Jandino Asporaat riffs on the challenges of ra',
                    'With the help of three human allies, the Autob']})
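Each value in cast is a comma-separated string of names, so grouping on the column directly counts whole cast lists rather than individual people. Split the names into one row per person first, then count:
df['cast'].str.split(', ').explode().value_counts()
value_counts() sorts by count in descending order, so the most frequent cast members come first and the least frequent last. Use .head(n) or .tail(n) to take either end. The result has the form: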
cast count
Alan Marriott 100
Jandino Asporaa 78
...
Peter 1
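To make the split-and-count approach easy to check, here is a minimal, self-contained sketch on a small made-up frame. The rows and the variable names `counts`, `most_frequent`, and `least_frequent` are illustrative assumptions and are not taken from the real dataset. It also drops missing casts and strips stray whitespace, which the real data may or may not need.

```python
import pandas as pd

# Hypothetical toy data: one comma-separated cast string per show,
# including one show with no cast listed.
df = pd.DataFrame({
    'show_id': ['1', '2', '3'],
    'cast': ['Alan Marriott, Peter Cullen', 'Alan Marriott', None],
})

counts = (
    df['cast']
    .dropna()            # ignore shows without a cast
    .str.split(',')      # 'A, B' becomes ['A', ' B']
    .explode()           # one row per cast member
    .str.strip()         # remove whitespace left over from the split
    .value_counts()      # count per name, sorted in descending order
)

most_frequent = counts.head(1)   # top of the sorted counts
least_frequent = counts.tail(1)  # bottom of the sorted counts

print(counts)
```

If only the extremes are needed, `counts.nlargest(n)` and `counts.nsmallest(n)` give the same ends of the ranking. Note that `nlargest()` without an argument returns just the top five rows.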