python-3.x csv aggregate

What's the easiest way to aggregate my CSV data in Python 3.9?


I'm using Python 3.9. I'm trying to parse this CSV file, which has 3 columns of data:

55,Fake ISD,SUCCESS
56,Other ISD,None
57,Third ISD,WARNING
58,Fourth ISD,FAILURE
59,Main ISD,SUCCESS
60,Secondary ISD,SUCCESS

I was wondering if there is some out-of-the-box library that would parse the CSV and aggregate the data based on the values in the third column. That is, I would want a report that lists

SUCCESS - 3 entries - Fake ISD, Main ISD, Secondary ISD
WARNING - 1 entry - Third ISD
FAILURE - 1 entry - Fourth ISD
None - 1 entry - Other ISD

How would I aggregate these in Python 3.9?


Solution

  • You can try pandas:

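    import pandas as pd
    
    # header=None: the file has no header row, so columns are labelled 0, 1, 2.
    # keep_default_na=False: keep the literal string "None" as text; recent
    # pandas versions otherwise read it as NaN, and groupby drops NaN keys.
    df = pd.read_csv("your_file.csv", header=None, keep_default_na=False)
    
    # Collect the names (column 1) for each status (column 2).
    x = df.groupby(2)[1].agg(list)
    for i, d in x.items():
        print(f'{i} - {len(d)} - {", ".join(d)}')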
    

    Prints:

    FAILURE - 1 - Fourth ISD
    None - 1 - Other ISD
    SUCCESS - 3 - Fake ISD, Main ISD, Secondary ISD
    WARNING - 1 - Third ISD
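
If you'd rather not add a dependency, the same grouping can probably be done with the standard library alone. The following is a rough sketch, not a drop-in replacement: the function names `aggregate_by_status` and `format_report` are made up for illustration, and the sample rows from the question are inlined so the snippet runs on its own (in practice you would pass an open file instead).

```python
import csv
import io
from collections import defaultdict

# Sample rows copied from the question; in real use, pass an open file
# (opened with newline="") instead of this in-memory buffer.
SAMPLE = """55,Fake ISD,SUCCESS
56,Other ISD,None
57,Third ISD,WARNING
58,Fourth ISD,FAILURE
59,Main ISD,SUCCESS
60,Secondary ISD,SUCCESS
"""


def aggregate_by_status(lines):
    """Hypothetical helper: map each third-column value to its list of names."""
    groups = defaultdict(list)
    for row in csv.reader(lines):
        if len(row) < 3:
            continue  # skip blank or short rows
        _id, name, status = row[:3]
        groups[status].append(name)
    return groups


def format_report(groups):
    """Hypothetical helper: build lines like 'SUCCESS - 3 entries - A, B, C'."""
    report = []
    for status, names in groups.items():
        noun = "entry" if len(names) == 1 else "entries"
        report.append(f"{status} - {len(names)} {noun} - {', '.join(names)}")
    return report


for line in format_report(aggregate_by_status(io.StringIO(SAMPLE))):
    print(line)
```

Because `csv.reader` returns every field as a plain string, the literal `None` is kept as an ordinary group key. Unlike the pandas version, the groups come out in the order each status is first seen in the file rather than sorted; wrap `groups.items()` in `sorted(...)` if you want alphabetical order.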