I program mostly by myself so nobody checks my code. I feel like I've developed a bunch of bad habits.
The code I am pasting here works, but I would like to hear some other solutions.
I create a dictionary called teams_shots. I iterate through a pandas dataframe, which has the name of the away team and the home team in one row. I would like to keep track of shots made by each team that appears in the data frame. That is why I check whether home_team_name or away_team_name already has an entry in the dictionary; if not, I create one.
    for index, match in df.iterrows():
        if match['home_team_name'] not in teams_shots:
            # we have to set up an entry in the dictionary
            teams_shots[match['home_team_name']] = []
            teams_shots[match['home_team_name']].append(match['home_team_shots'])
            home_shots_avg.append(None)
        else:
            home_shots_avg.append(np.mean(teams_shots[match['home_team_name']]))
            teams_shots[match['home_team_name']].append(match['home_team_shots'])
        if match['away_team_name'] not in teams_shots:
            teams_shots[match['away_team_name']] = []
            teams_shots[match['away_team_name']].append(match['away_team_shots'])
            away_shots_avg.append(None)
        else:
            away_shots_avg.append(np.mean(teams_shots[match['away_team_name']]))
            teams_shots[match['away_team_name']].append(match['away_team_shots'])
As you can see, almost the same code is written twice, which is not a sign of good programming. I thought about using an or operator in the if statement, but then one entry might already exist and I would overwrite it. Any ideas on how to write this code better?
In this case I think an additional for loop should do the trick:
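    for index, match in df.iterrows():
        # each tuple: name column, shots column, list that collects the averages
        for name, shots, avg in (('home_team_name', 'home_team_shots', home_shots_avg),
                                 ('away_team_name', 'away_team_shots', away_shots_avg)):
            team = match[name]
            if team not in teams_shots:
                # we have to set up an entry in the dictionary
                teams_shots[team] = []
                avg.append(None)
            else:
                avg.append(np.mean(teams_shots[team]))
            # record this match's shots only after the average has been taken
            teams_shots[team].append(match[shots])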
But there might be a way to handle this in a vectorized fashion.
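One possible vectorized version, as a rough sketch rather than a tested drop-in replacement: stack the home and away columns into one long table, then use grouped cumulative sums and counts to get each team's average over its earlier matches. The function name prior_shot_averages and the sample data are made up for illustration. It assumes the rows of df are already in chronological order and that a team never plays itself. Note that it yields NaN where the loop above appends None.

```python
import numpy as np
import pandas as pd


def prior_shot_averages(df):
    """Return (home_avg, away_avg): each team's mean shots over its earlier matches."""
    parts = []
    for side in ('home', 'away'):
        parts.append(pd.DataFrame({
            'match': np.arange(len(df)),   # positional match number
            'side': side,
            'team': df[f'{side}_team_name'].to_numpy(),
            'shots': df[f'{side}_team_shots'].to_numpy(),
        }))
    # One row per (match, team), ordered by match
    long = pd.concat(parts, ignore_index=True)
    long = long.sort_values('match', kind='stable').reset_index(drop=True)

    grouped = long.groupby('team')['shots']
    prior_sum = grouped.cumsum() - long['shots']   # shots before this match
    prior_n = grouped.cumcount()                   # matches before this one
    # Where a team has no earlier match, the divisor becomes NaN, so the result is NaN
    long['prior_avg'] = prior_sum / prior_n.where(prior_n > 0)

    home = long.loc[long['side'] == 'home', 'prior_avg'].to_numpy()
    away = long.loc[long['side'] == 'away', 'prior_avg'].to_numpy()
    return pd.Series(home, index=df.index), pd.Series(away, index=df.index)


# Hypothetical sample data, only to show the shape of the result
df = pd.DataFrame({
    'home_team_name': ['A', 'B', 'A', 'C'],
    'away_team_name': ['B', 'C', 'C', 'A'],
    'home_team_shots': [10, 8, 12, 6],
    'away_team_shots': [5, 7, 9, 11],
})
home_avg, away_avg = prior_shot_averages(df)
print(home_avg.tolist())  # [nan, 5.0, 10.0, 8.0]
print(away_avg.tolist())  # [nan, nan, 7.0, 11.0]
```

The results can be assigned straight back as columns, e.g. df['home_shots_avg'] = home_avg, since both series share the index of df.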