Tags: python, excel, python-3.x, xls, xlrd

Trying to take a subset from an Excel file


I am trying to write a simple program, but I am struggling because I am still learning Python.

I have an xlsx file with two columns:

Team, Player

What I want to do is apply a filter to the field Team, then take a random subset of 10 players from EACH team.

I've started like so:

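import xlrd

# First open the workbook. The raw string (r'...') stops Python from
# treating backslashes such as \U as escape sequences.
# Note: xlrd 2.0 and later reads only .xls files, so opening an .xlsx
# this way needs xlrd < 2.0 (or a library such as openpyxl).
wb = xlrd.open_workbook(r'C:\Users\ADMIN\Desktop\1.xlsx')

# Then select the sheet.
sheet = wb.sheet_by_name('Sheet_1')

# Then get the values of each column, skipping the first item (the header).
team = sheet.col_values(0)[1:]
players = sheet.col_values(1)[1:]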

However, I am stuck on how to proceed from here.

Can anyone offer any feedback or advice, please?


Solution

  • You can build a dictionary keyed by team, where each value is the list of players on that team, and then sample from each list:

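    import random

    # Group players by team: {team_name: [player, ...]}
    teams = {}
    for t, p in zip(team, players):
        if t in teams:
            teams[t].append(p)
        else:
            teams[t] = [p]

    # Draw 10 random players per team (all of them if a team has fewer than 10,
    # since random.sample raises ValueError when asked for more than exist)
    samples = [random.sample(teams[t], min(10, len(teams[t]))) for t in teams]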
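
A possible variation on the same idea, as a minimal sketch rather than a definitive version: it keeps the team name next to each sample by returning a dictionary, and it uses `collections.defaultdict` to shorten the grouping step. The function name `sample_per_team`, its `k` and `seed` parameters, and the made-up team data are assumptions for illustration only; in the real program `team` and `players` would be the two column lists read from the sheet.

```python
import random
from collections import defaultdict


def sample_per_team(team, players, k=10, seed=None):
    """Return {team_name: up to k randomly chosen players} (hypothetical helper)."""
    rng = random.Random(seed)  # seed is optional; pass one for repeatable draws

    # Group players by team; defaultdict creates the empty list on first use.
    roster = defaultdict(list)
    for t, p in zip(team, players):
        roster[t].append(p)

    # random.sample raises ValueError if k exceeds the list length,
    # so cap k at the size of each team.
    return {t: rng.sample(ps, min(k, len(ps))) for t, ps in roster.items()}


# Made-up data standing in for the two spreadsheet columns.
team = ["Red"] * 12 + ["Blue"] * 3
players = ["R%d" % i for i in range(12)] + ["B%d" % i for i in range(3)]

picked = sample_per_team(team, players, k=10, seed=0)
for name, chosen in picked.items():
    print(name, len(chosen), chosen)
```

With this sample data, the Red team yields 10 distinct names and the Blue team yields all 3 of its players, because the requested size is capped at the team size.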