Let say I have some data associated with football
Date Home Away HomeGoal AwayGoal TotalGoal
2019 Arsenal MU 5 1 6
2019 MCity Liv 2 2 4
2019 MU Liv 3 4 7
2019 MCity MU 0 0 0
I want to create a column of data which show the average goals of that team in the recent 2 games. For example in the last row, I want to include a column that shows the average goal of MU in their last 2 games, which = (1+3)/2 = 2.
Is there any functions in python can achieve this?
For your requirement, you don't care if a team is Home or Away, only how many goal it scored per match. Try this:
# Rename the columns to make the unstacking operation a bit easier
# Always a good idea to specify an explicit `copy` when you intend
# to change the dataframe structure
>>> tmp = df[['Home', 'Away', 'HomeGoal', 'AwayGoal']].copy()
# Arrange the columns into a MultiIndex to make stacking easier
>>> tmp.columns = pd.MultiIndex.from_product([['Team', 'Goal'], ['Home', 'Away']])
# This is what `tmp` look like:
Team Goal
Home Away Home Away
0 Arsenal MU 5 1
1 MCity Liv 2 2
2 MU Liv 3 4
3 MCity MU 0 0
# And now the magic
>>> tmp.stack() \
.groupby('Team').rolling(2).mean() \
.groupby('Team').tail(1) \
.droplevel([1,2])
# Result
Goal
Team
Arsenal NaN
Liv 3.0
MCity 1.0
MU 1.5
Here's how it works:
stack
unpivots Home
and Away
so that for every match, we have 2 rows for Teams
and Goal
groupby('Team').rolling(2).mean()
gets the rolling average of goal scored over the last 2 games per teamgroupby('Team').tail(1)
gets the last of those rolling averages per team