I have the following pandas.core.series.Series: x = pd.Series([17.39, 8.70, 2.90, 1.45, 1.45]) for t = 0, 1, 2, 3, 4.
When I try x.ewm(span=2).mean() I get the following results:
17.39, 10.87, 5.35, 2.71, 1.86
My understanding is that .ewm().mean() uses the following explicit formula:
y[t] = (1 - alpha) * y[t-1] + alpha * x[t], where alpha = 2/(span+1)
y[0] = x[0]
Using the above formula:
EWMA[0] = x[0] = 17.39
EWMA[1] = (1-(2/(2+1)))*17.39 + (2/(2+1))*8.7 = 11.59, which is different to 10.87.
EWMA[2] = (1-(2/(2+1)))*10.87 + (2/(2+1))*2.9 = 5.55, which is different to 5.35.
EWMA[3] = (1-(2/(2+1)))*5.35 + (2/(2+1))*1.45 = 2.75, which is different to 2.71.
And so on.
Could you please help me understand where these differences are coming from? What am I missing or doing wrong here? Thank you.
The formula that you're using is correct for x.ewm(span=2, adjust=False).mean(). Here is some simple code that correctly replicates the behavior under the default adjust=True setting.
span = 2
alpha = 2/(span + 1)
num = 0  # running weighted sum of the observations
den = 0  # running sum of the weights
for b in x:
    num = (1 - alpha)*num + b
    den = 1 + (1 - alpha)*den
    print(num/den)
For more, see the documentation for ewm and look at the description for the adjust parameter.
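To make the difference between the two settings concrete, here is a minimal pure-Python sketch that computes both versions for the series in the question. It does not call pandas. The function names ewm_adjusted and ewm_recursive are made up for illustration, and the weighted-average form is my reading of the adjust=True description, so treat it as a sketch rather than the library's implementation.

```python
# Data and span taken from the question.
x = [17.39, 8.70, 2.90, 1.45, 1.45]
span = 2
alpha = 2 / (span + 1)


def ewm_adjusted(values, alpha):
    """Weighted average with weights (1 - alpha)**i on x[t - i] (adjust=True style)."""
    w = 1 - alpha
    out = []
    for t in range(len(values)):
        # Weight for x[t], x[t-1], ..., x[0], in that order.
        weights = [w ** i for i in range(t + 1)]
        num = sum(wi * values[t - i] for i, wi in enumerate(weights))
        out.append(num / sum(weights))
    return out


def ewm_recursive(values, alpha):
    """Recursion from the question: y[t] = (1 - alpha)*y[t-1] + alpha*x[t] (adjust=False style)."""
    out = [values[0]]
    for v in values[1:]:
        out.append((1 - alpha) * out[-1] + alpha * v)
    return out


adjusted = ewm_adjusted(x, alpha)
unadjusted = ewm_recursive(x, alpha)
print([round(v, 4) for v in adjusted])
print([round(v, 4) for v in unadjusted])
```

The first list should agree with the values reported in the question (10.87, 5.35, ...) up to rounding, while the second follows the hand calculation, starting with 11.59... at t = 1.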
The documentation gives you a summation formula for the adjust=True version of the EW function. If you want a comparable formula for the adjust=False version, it turns out that EWMA[n] is given by the result of
result = (1-alpha)**n * x[0] + alpha * sum((1-alpha)**(n-k) * x[k] for k in range(1, n+1))
That is,
$$(1-\alpha)^{n}\, x_0 + \alpha \sum_{k=1}^{n} (1-\alpha)^{n-k}\, x_k$$
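As a quick sanity check, the sketch below compares this closed-form sum, with each term of the sum multiplied by x[k], against the recursion y[t] = (1 - alpha)*y[t-1] + alpha*x[t] on the data from the question. The helper names closed_form and recursive are hypothetical, and plain Python lists stand in for the pandas Series.

```python
# Data and span taken from the question.
x = [17.39, 8.70, 2.90, 1.45, 1.45]
alpha = 2 / (2 + 1)


def closed_form(values, alpha, n):
    """(1-alpha)^n * x[0] + alpha * sum_{k=1..n} (1-alpha)^(n-k) * x[k]."""
    w = 1 - alpha
    tail = sum(w ** (n - k) * values[k] for k in range(1, n + 1))
    return w ** n * values[0] + alpha * tail


def recursive(values, alpha):
    """y[0] = x[0]; y[t] = (1 - alpha)*y[t-1] + alpha*x[t]."""
    y = [values[0]]
    for v in values[1:]:
        y.append((1 - alpha) * y[-1] + alpha * v)
    return y


rec = recursive(x, alpha)
closed = [closed_form(x, alpha, n) for n in range(len(x))]

# Largest absolute difference between the two forms; expected to be
# at the level of floating-point round-off.
max_gap = max(abs(a - b) for a, b in zip(rec, closed))
print(max_gap)
```

If the two forms are equivalent, max_gap should be zero up to floating-point error for every n.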