Search code examples
pythonexcelpandasdataframe

Pandas dataframe group by column get count of two columns


I am pulling in data from an excel spreadsheet and creating a list of dictionary objects from the rows in that spreadsheet. I converted that to a pandas dataframe. In this data frame I have a column with unique names and then two more columns success and failed in these column I was keeping a count per so either 1 0.

Name   success   failed
Name1  1         0
Name1  0         1
Name1  1         0
Name1  0         1
Name1  1         0
Name2  1         0
Name2  1         0
Name2  0         1
Name3  1         0
Name3  0         1
Name3  0         1
Name3  1         0

I found using df.groupby("name").to_values() displays the name and success and failed column and the columns are grouped with the totals per name of success and failed, but I do not know how to continue working with this as a new dataframe object. If I output this to excel I just have one row with totals. Is there a way to create a new object from this with all my columns? My intention was to then show a percentage of a total from the two success and failed per name counts after that.

Name  Success
Name1 90%
Name2 94%
Name3 89%

Solution

  • Code

    out = df.groupby('Name')['success'].mean().map('{:.1%}'.format).reset_index()
    

    out:

        Name success
    0  Name1   60.0%
    1  Name2   66.7%
    2  Name3   50.0%