pandas jupyter-notebook data-science data-analysis

Select the first row of each group after 'groupby()' and 'value_counts()'

I have a data set named new_data_set which looks like this:

I want to find the genre that occurs most often in each year.

So I did this:

new_data_set.groupby('release_year')['genre'].apply(lambda x: x.value_counts())

The result lists the count of every genre within each year, sorted from most to least frequent.

Now I need to fetch the first row from each group to get the answer. The result should look like this:

1960 Drama
1961 Drama
.
.

How should I do this?

Solution

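value_counts() sorts its result by count in descending order, so the first index label in each group is the most frequent genre. Take index[0] inside the lambda and then call reset_index():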

import pandas as pd

new_data_set = pd.DataFrame({
    'release_year': [2004, 2005, 2004, 2005, 2005, 2004],
    'genre': list('aaabbb')
})

# index[0] picks the top genre of each release_year group
df = (new_data_set.groupby('release_year')['genre']
                  .apply(lambda x: x.value_counts().index[0])
                  .reset_index()
                 )
print(df)
   release_year genre
0          2004     a
1          2005     b
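
As a possible alternative, the same per-year lookup can be sketched with idxmax() or Series.mode() instead of index[0]. This is only an illustration on the small sample frame from the answer, not a replacement for it; the variable names top_idxmax and top_mode are made up for this example. When two genres tie within a year, which one is returned may differ between the approaches, so check that case against your own data.

```python
import pandas as pd

# Same small sample frame as in the answer above
new_data_set = pd.DataFrame({
    'release_year': [2004, 2005, 2004, 2005, 2005, 2004],
    'genre': list('aaabbb'),
})

# Option 1: idxmax() returns the index label (genre) with the highest count
top_idxmax = (new_data_set.groupby('release_year')['genre']
                          .agg(lambda x: x.value_counts().idxmax())
                          .reset_index())

# Option 2: mode() returns all most frequent values in sorted order;
# iloc[0] keeps the first of them
top_mode = (new_data_set.groupby('release_year')['genre']
                        .agg(lambda x: x.mode().iloc[0])
                        .reset_index())

print(top_idxmax)
print(top_mode)
```

Both variants return one row per release_year with the columns release_year and genre, the same shape as df above.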
