
How to group and sum data and return the biggest sum in Python?


Let's say my data looks like this:

news_title  company
string      Facebook
string      Facebook
string      Amazon
string      Apple
string      Amazon
string      Facebook

How can I group the rows by company and get the name and the count for the company that appears most often?

I want to be able to print something like:

Facebook was mentioned in the news the most - 24 times.

I tried this but it did not work the way I wanted:

df.groupby("company").sum()

Solution

  • Use value_counts:

    >>> df.company.value_counts().head(1)
    Facebook    3
    Name: company, dtype: int64
    

    Update, in response to the follow-up question "Could you please tell me how I could go about printing it out in a sentence?":

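    # value_counts() sorts in descending order, so the first entry is the most frequent company
    counts = df.company.value_counts()
    company, count = counts.index[0], counts.iloc[0]
    print(f'{company} was mentioned in the news the most - {count} times.')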
    
    # Output:
    Facebook was mentioned in the news the most - 3 times.
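
For comparison, here is a minimal self-contained sketch that stays closer to the groupby attempt in the question. It assumes the sample data shown above (the "string" titles are placeholders) and uses size() rather than sum(), since size() counts the rows in each group. The names mentions, top_company, top_count and sentence are only illustrative.

```python
import pandas as pd

# Sample data from the question; the "string" titles are placeholders.
df = pd.DataFrame({
    "news_title": ["string"] * 6,
    "company": ["Facebook", "Facebook", "Amazon", "Apple", "Amazon", "Facebook"],
})

# size() counts the rows in each group; sum() would instead add up column values.
mentions = df.groupby("company").size()

top_company = mentions.idxmax()   # index label of the largest count
top_count = int(mentions.max())   # the largest count itself

sentence = f"{top_company} was mentioned in the news the most - {top_count} times."
print(sentence)
```

On this sample it should give the same company and count as the value_counts approach. If two companies tie for first place, idxmax returns only the first one it encounters, so a tie would need separate handling.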