I'm trying to create a dataframe with the columns 'team', 'games', 'wins', 'losses' and 'ties'.
Here's a snippet of the data:
[{'away_games': {'games': 4, 'losses': 2, 'ties': 0, 'wins': 2},
'conference': 'Mountain West',
'conference_games': {'games': 8, 'losses': 3, 'ties': 0, 'wins': 5},
'division': 'Mountain',
'expected_wins': 9.9,
'home_games': {'games': 7, 'losses': 1, 'ties': 0, 'wins': 6},
'team': 'Air Force',
'total': {'games': 13, 'losses': 3, 'ties': 0, 'wins': 10},
'year': 2022},
{'away_games': {'games': 8, 'losses': 6, 'ties': 0, 'wins': 1},
'conference': 'Mid-American',
'conference_games': {'games': 9, 'losses': 7, 'ties': 0, 'wins': 1},
'division': 'East',
'expected_wins': 1.5,
'home_games': {'games': 5, 'losses': 4, 'ties': 0, 'wins': 1},
'team': 'Akron',
'total': {'games': 13, 'losses': 10, 'ties': 0, 'wins': 2},
'year': 2022},
Here's the code I tried:
# Create an empty DataFrame
df = pd.DataFrame(columns=['team', 'games', 'wins', 'losses', 'ties'])
# Loop through each record in the data
for record in data:
    try:
        # Extract the desired values
        team = record['team']
        games = record['total'].get['games']
        wins = record['total'].get['wins']
        losses = record['total'].get['losses']
        ties = record['total'].get['ties']
        # Create a new row with the extracted values
        new_row = {'team': team, 'games': games, 'wins': wins, 'losses': losses, 'ties': ties}
        # Append the new row to the DataFrame
        df = df.append(new_row, ignore_index=True)
    except KeyError as e:
        print(f"Skipping record due to missing key: {e}")
# Print the resulting DataFrame
print(df)
I'm getting an error that the 'TeamRecord' object is not subscriptable.
I'm sure there's a better / easier way to do this. Any advice would be much appreciated.
That's how it's supposed to look:
import pandas as pd
data=[{'away_games': {'games': 4, 'losses': 2, 'ties': 0, 'wins': 2},
'conference': 'Mountain West',
'conference_games': {'games': 8, 'losses': 3, 'ties': 0, 'wins': 5},
'division': 'Mountain',
'expected_wins': 9.9,
'home_games': {'games': 7, 'losses': 1, 'ties': 0, 'wins': 6},
'team': 'Air Force',
'total': {'games': 13, 'losses': 3, 'ties': 0, 'wins': 10},
'year': 2022},
{'away_games': {'games': 8, 'losses': 6, 'ties': 0, 'wins': 1},
'conference': 'Mid-American',
'conference_games': {'games': 9, 'losses': 7, 'ties': 0, 'wins': 1},
'division': 'East',
'expected_wins': 1.5,
'home_games': {'games': 5, 'losses': 4, 'ties': 0, 'wins': 1},
'team': 'Akron',
'total': {'games': 13, 'losses': 10, 'ties': 0, 'wins': 2},
'year': 2022}]
# Collect the rows in a plain list instead of appending to a DataFrame
rows = []
# Loop through each record in the data
for record in data:
    try:
        # Extract the desired values
        team = record['team']
        games = record['total']['games']
        wins = record['total']['wins']
        losses = record['total']['losses']
        ties = record['total']['ties']
        # Create a new row with the extracted values
        new_row = {'team': team, 'games': games, 'wins': wins, 'losses': losses, 'ties': ties}
        rows.append(new_row)
    except KeyError as e:
        print(f"Skipping record due to missing key: {e}")
# Build the DataFrame once from the collected rows, then print it
df = pd.DataFrame(rows, columns=['team', 'games', 'wins', 'losses', 'ties'])
print(df)
        team  games  wins  losses  ties
0  Air Force     13    10       3     0
1      Akron     13     2      10     0
It also looks like your data is off: wins, losses, and ties should add up to the total number of games played, and that's not the case for Akron (2 + 10 + 0 = 12, not 13).
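If you want to flag such rows automatically, here is a minimal sketch of the check. It assumes a DataFrame with the same five columns as above; the name mismatch is just something I made up for illustration.

```python
import pandas as pd

# Same values as in the output above, typed in by hand for a standalone example
df = pd.DataFrame({
    "team": ["Air Force", "Akron"],
    "games": [13, 13],
    "wins": [10, 2],
    "losses": [3, 10],
    "ties": [0, 0],
})

# Keep only rows where wins + losses + ties does not equal games
mismatch = df[df["wins"] + df["losses"] + df["ties"] != df["games"]]
print(mismatch["team"].tolist())  # ['Akron']
```

Whether a mismatch really is an error depends on how the source counts games, so treat this as a sanity check rather than proof of bad data.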
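Don't use get with square brackets: get is a method, so record['total'].get['games'] tries to subscript the method itself. Index the dict directly with record['total']['games'], or call record['total'].get('games'). See also "Create a Pandas Dataframe by appending one row at a time" regarding append, which was deprecated and then removed in pandas>=2.0.0. Appending to a DataFrame in a loop is bad practice in most cases; collect the rows in a list and build the DataFrame once.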
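If you'd rather skip the explicit loop entirely, pd.json_normalize can flatten the nested dicts for you. This is only a sketch and it assumes your records are plain dicts shaped like the snippet in the question (I trimmed them to the relevant keys here); if they are objects of some other type, you'd have to convert them to dicts first.

```python
import pandas as pd

# Trimmed-down records with the same shape as the data in the question
data = [
    {"team": "Air Force", "year": 2022,
     "total": {"games": 13, "losses": 3, "ties": 0, "wins": 10}},
    {"team": "Akron", "year": 2022,
     "total": {"games": 13, "losses": 10, "ties": 0, "wins": 2}},
]

# Nested keys become dotted column names such as 'total.games'
flat = pd.json_normalize(data)

# Select the wanted columns, then strip the 'total.' prefix from their names
cols = ["team", "total.games", "total.wins", "total.losses", "total.ties"]
df = flat[cols].rename(columns=lambda c: c.split(".")[-1])
print(df)
```

This gives the same five columns as the loop version. The loop is still the easier option if you need to skip records with missing keys, since selecting a column that doesn't exist raises a KeyError for the whole frame.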