
How to group a list by value without causing an AttributeError


I have a CSV file, outputA.csv, with the following format:

Position,Category,Name,Team,Points
1,A,James,Team 1,100
2,A,Mark,Team 2,95
3,A,Tom,Team 1,90

I want to produce an output CSV containing the total points for each team, the average points per team, and the number of riders.

So the output would be:

Team,Points,AvgPoints,NumOfRiders
Team 1,190,95,2
Team 2,95,95,1

I have this function to convert each row to a namedtuple:

fields = ("Position", "Category", "Name", "Team", "Points")
Results = namedtuple('CategoryResults', fields)

def csv_to_tuple(path):
    with open(path, 'r', errors='ignore') as file:
        reader = csv.reader(file)
        for row in map(Results._make, reader):
            yield row

Then this sorts the rows into a list ordered by their club (the Team field):

moutputA = sorted(list(csv_to_tuple("Male/outputA.csv")), key=lambda k: k[3])

This returns a list like:

[CategoryResults(Position='13', Category='A', Name='Marek', Team='1', Points='48'), CategoryResults(Position='7', Category='A', Name='', Team='1', Points='70')]

I believe this is correct so far, although I could be wrong.

I am trying to create a new list of teams with the points (not yet added up).

For example:

[Team 1(1,2,3,4,5)]
[Team 2 (6,9,10)]
etc.

The idea is that I can find how many unique values of points there are (this equals the number of riders). However, when trying to group the list I have this code:

Clubs = []
Club_Points = []
for Names, Club in groupby(moutputA, lambda x: x[3]):
    for Teams in Names:
        Clubs.append(list(Teams))

for Club, Points in groupby(moutputA, lambda x: x[4]):
    for Point in Clubs:
        Club_Points.append(list(Point))

print(Clubs)

but this returns this error:

    Teams.append(list(Team))
AttributeError: 'itertools._grouper' object has no attribute 'append'

Solution
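
Some background on the error may help before the full script. groupby yields (key, group) pairs, and each group is a lazy, one-shot iterator (an itertools._grouper), not a list, so it has no append method. The traceback suggests append was called on such a group rather than on the result list. The sketch below uses made-up rows shaped like the namedtuples in the question (the rows list and the points_by_team name are illustrative assumptions, not part of the original code) and builds the per-team list of points the question describes:

```python
from collections import namedtuple
from itertools import groupby

# Hypothetical sample rows, shaped like the namedtuples in the question.
fields = ("Position", "Category", "Name", "Team", "Points")
Results = namedtuple("CategoryResults", fields)
rows = [
    Results("1", "A", "James", "Team 1", "100"),
    Results("3", "A", "Tom", "Team 1", "90"),
    Results("2", "A", "Mark", "Team 2", "95"),
]  # already sorted by Team, which groupby needs to form one group per team

points_by_team = {}
for team, group in groupby(rows, key=lambda r: r.Team):
    # `team` is the key (a str); `group` is an iterator, not a list,
    # so calling group.append(...) would raise AttributeError.
    assert not hasattr(group, "append")
    # Materialise the group while it is still valid for this iteration.
    points_by_team[team] = [int(r.Points) for r in group]

print(points_by_team)
# {'Team 1': [100, 90], 'Team 2': [95]}
```

The order of the loop variables matters: the key comes first and the group second. Each group should be consumed (for example with list(...)) before the loop moves on to the next key.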

  • If data.csv contains:

    Position,Category,Name,Team,Points
    1,A,James,Team 1,100
    2,A,Mark,Team 2,95
    3,A,Tom,Team 1,90
    

    Then this script:

    import csv
    from collections import namedtuple
    from itertools import groupby
    from statistics import mean
    
    fields = ("Position", "Category", "Name", "Team", "Points")
    Results = namedtuple('CategoryResults', fields)
    
    def csv_to_tuple(path):
        with open(path, 'r', errors='ignore') as file:
            next(file) # skip header
            reader = csv.reader(file)
            for row in map(Results._make, reader):
                yield row
    
    moutputA = sorted(csv_to_tuple("data.csv"), key=lambda k: k.Team)
    
    out = []
    for team, group in groupby(moutputA, lambda x: x.Team):
        group = list(group)
        d = {}
        d['Team'] = team
        d['Points'] = sum(int(i.Points) for i in group)
        d['AvgPoints'] = mean(int(i.Points) for i in group)
        d['NumOfRiders'] = len(group)
        out.append(d)
    
    
    with open('data_out.csv', 'w', newline='') as csvfile:
        fieldnames = ['Team', 'Points', 'AvgPoints', 'NumOfRiders']
        writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
    
        writer.writeheader()
        for row in out:
            writer.writerow(row)
    

    Produces data_out.csv:

    Team,Points,AvgPoints,NumOfRiders
    Team 1,190,95,2
    Team 2,95,95,1
    
