Can someone help me understand where I'm going wrong? I don't know why the two volatility columns give different values...
This is an example of my code:
from math import sqrt
from numpy import around
from numpy.random import uniform
from pandas import DataFrame
from statistics import stdev
data = around(a=uniform(low=1.0, high=50.0, size=(500, 1)), decimals=3)
df = DataFrame(data=data, columns=['close'], dtype='float64')
df.loc[:, 'delta'] = df.loc[:, 'close'].pct_change().fillna(0).round(3)
volatility = []
for index in range(df.shape[0]):
    if index < 90:
        volatility.append(0)
    else:
        start = index - 90
        stop = index + 1
        volatility.append(stdev(df.loc[start:stop, 'delta']) * sqrt(252))
df.loc[:, 'volatility1'] = volatility
df.loc[:, 'volatility2'] = df.loc[:, 'delta'].rolling(window=90).std(ddof=0) * sqrt(252)
print(df)
      close   delta  volatility1  volatility2
0    10.099   0.000     0.000000          NaN
1    26.331   1.607     0.000000          NaN
2    32.361   0.229     0.000000          NaN
3     2.068  -0.936     0.000000          NaN
4    36.241  16.525     0.000000          NaN
..      ...     ...          ...          ...
495  48.015  -0.029    46.078037    46.132943
496   6.988  -0.854    46.036210    46.178820
497  23.331   2.339    46.003184    45.837245
498  25.551   0.095    45.608260    45.792188
499  46.248   0.810    45.793012    45.769787

[500 rows x 4 columns]
Thank you so much!
There are three small changes needed; I added comments inline. 89 is needed (with stop = index rather than index + 1) because .loc slicing includes the end label, unlike most other slicing in Python. ddof=1 is needed because stdev uses it by default. This article talks about numpy std instead of stdev, but the theory of what ddof does is the same.
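To make those two points concrete, here is a minimal sketch on a made-up five-element Series (the values are arbitrary and only for illustration). It checks how many rows a `.loc` slice returns compared with `.iloc`, and that `statistics.stdev` lines up with `ddof=1` while `statistics.pstdev` lines up with `ddof=0`.

```python
from math import isclose
from statistics import pstdev, stdev

import pandas as pd

# Arbitrary illustrative values, not taken from the question's data.
s = pd.Series([1.0, 2.0, 4.0, 8.0, 16.0])

# .loc slices by label and includes BOTH endpoints: labels 1, 2, 3.
loc_len = len(s.loc[1:3])
# .iloc slices by position and excludes the stop: positions 1, 2.
iloc_len = len(s.iloc[1:3])

values = s.tolist()
sample_sd = stdev(values)       # divides by n - 1
population_sd = pstdev(values)  # divides by n

# pandas' ddof argument selects between the two divisors.
same_as_stdev = isclose(s.std(ddof=1), sample_sd)
same_as_pstdev = isclose(s.std(ddof=0), population_sd)

print(loc_len, iloc_len, same_as_stdev, same_as_pstdev)
```

So a slice like `df.loc[index - 90:index + 1]` covers 92 rows, not 90, which is why the loop and the 90-row rolling window disagreed.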
Also, in the future, try changing size to something like 95. You don't need the other 405 rows when debugging, and it is nice to see the changeover from 0/NaN to actual volatility, which shows that you need 89, not 90.
The 0 vs NaN difference still exists. It comes from your loop appending 0 while rolling, by default, returns NaN until the window is full. I wasn't sure if that was intentional or not, so I left it.
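If you do want the warm-up rows to agree as well, one possible approach (an assumption about what you want, not something the question requires) is to fill the rolling result's leading NaN values with 0. The sketch below uses a small hypothetical window of 4 on made-up data so the changeover is easy to see.

```python
import numpy as np
import pandas as pd

# Made-up data and a small hypothetical window, for illustration only.
delta = pd.Series(np.linspace(-1.0, 1.0, 10))
window = 4

rolled = delta.rolling(window=window).std(ddof=1)

# With an integer window, min_periods defaults to the window size,
# so the first window - 1 results are NaN.
n_nan = int(rolled.isna().sum())

# Replace the warm-up NaN values with 0 to mirror the loop's append(0).
rolled_zero = rolled.fillna(0)

print(n_nan)
print(rolled_zero)
```

Going the other way (appending `float('nan')` in the loop instead of 0) is arguably safer, since a volatility of 0 could be mistaken for a real value.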
from math import sqrt
from numpy import around
from numpy.random import uniform
from pandas import DataFrame
from statistics import stdev
data = around(a=uniform(low=1.0, high=50.0, size=(500, 1)), decimals=3)
df = DataFrame(data=data, columns=['close'], dtype='float64')
df['delta'] = df['close'].pct_change().fillna(0).round(3)
volatility = []
for index in range(df.shape[0]):
    if index < 89:  # change to 89
        volatility.append(0)
    else:
        start = index - 89  # change to 89
        stop = index  # was index + 1; .loc includes the end label
        volatility.append(stdev(df.loc[start:stop, 'delta']) * sqrt(252))
df['volatility1'] = volatility
df['volatility2'] = df.loc[:, 'delta'].rolling(window=90).std(ddof=1) * sqrt(252) #change to ddof=1
print(df)
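As a quick sanity check, here is a sketch of the same comparison on 95 rows, as suggested above. The seed, the `window` variable, and the names `df_small`, `manual`, and `matches` are my own choices for illustration. It also appends NaN instead of 0 during warm-up so the two columns can be compared directly.

```python
from math import sqrt
from statistics import stdev

import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # arbitrary seed, only for repeatability
close = rng.uniform(low=1.0, high=50.0, size=95).round(3)
df_small = pd.DataFrame({'close': close})
df_small['delta'] = df_small['close'].pct_change().fillna(0).round(3)

window = 90
manual = []
for index in range(len(df_small)):
    if index < window - 1:
        # Not enough rows yet; use NaN to match rolling's warm-up output.
        manual.append(float('nan'))
    else:
        # .loc includes both labels, so this slice has exactly `window` rows.
        chunk = df_small.loc[index - (window - 1):index, 'delta']
        manual.append(stdev(chunk.tolist()) * sqrt(252))

df_small['volatility1'] = manual
df_small['volatility2'] = (
    df_small['delta'].rolling(window=window).std(ddof=1) * sqrt(252)
)

# Compare the two columns, treating NaN in the same position as equal.
matches = np.allclose(
    df_small['volatility1'], df_small['volatility2'], equal_nan=True
)
print(df_small.iloc[86:92])  # rows around the NaN-to-value changeover
print(matches)
```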