python, pandas, volatility

Calculating volatility manually vs. with built-in functions gives different results


Can someone help me understand where I'm going wrong? I don't know why the two volatility columns come out different...

This is an example of my code:

from math import sqrt
from numpy import around
from numpy.random import uniform
from pandas import DataFrame
from statistics import stdev

data = around(a=uniform(low=1.0, high=50.0, size=(500, 1)), decimals=3)
df = DataFrame(data=data, columns=['close'], dtype='float64')
df.loc[:, 'delta'] = df.loc[:, 'close'].pct_change().fillna(0).round(3)

volatility = []

for index in range(df.shape[0]):
    if index < 90:
        volatility.append(0)
    else:
        start = index - 90
        stop = index + 1
        volatility.append(stdev(df.loc[start:stop, 'delta']) * sqrt(252))

df.loc[:, 'volatility1'] = volatility
df.loc[:, 'volatility2'] = df.loc[:, 'delta'].rolling(window=90).std(ddof=0) * sqrt(252)

print(df)

      close   delta  volatility1  volatility2
0    10.099   0.000     0.000000          NaN
1    26.331   1.607     0.000000          NaN
2    32.361   0.229     0.000000          NaN
3     2.068  -0.936     0.000000          NaN
4    36.241  16.525     0.000000          NaN
..      ...     ...          ...          ...
495  48.015  -0.029    46.078037    46.132943
496   6.988  -0.854    46.036210    46.178820
497  23.331   2.339    46.003184    45.837245
498  25.551   0.095    45.608260    45.792188
499  46.248   0.810    45.793012    45.769787

[500 rows x 4 columns]

Thank you so much!


Solution

  • A few small changes are needed; comments are added inline. The window bounds use 89, with stop = index, because .loc slicing is inclusive of the endpoint (unlike most Python slicing), so df.loc[index - 89:index, 'delta'] covers exactly 90 rows. ddof=1 is needed because statistics.stdev computes the sample standard deviation, which is what ddof=1 means. This article talks about numpy's std instead of stdev, but the explanation of what ddof does is the same. A short demo of both points follows the corrected code below.

    Also, in the future, try changing size to something like (95, 1). You don't need the other 405 rows while debugging, and it is nice to see the changeover from 0/NaN to actual volatility values, which makes it clear that you need 89 rather than 90.

    The 0 vs NaN difference in the first 89 rows still exists. It comes from you appending 0 in the loop while rolling returns NaN until its window is full. I wasn't sure whether that was intentional, so I left it; a sketch of two ways to reconcile it follows the demo below.

    from math import sqrt
    from numpy import around
    from numpy.random import uniform
    from pandas import DataFrame
    from statistics import stdev
    
    data = around(a=uniform(low=1.0, high=50.0, size=(500, 1)), decimals=3)
    df = DataFrame(data=data, columns=['close'], dtype='float64')
    df['delta'] = df['close'].pct_change().fillna(0).round(3)
    
    volatility = []
    
    for index in range(df.shape[0]):
        if index < 89: # changed to 89: the first full 90-row window ends at index 89
            volatility.append(0)
        else:
            start = index - 89 # changed to 89: .loc slicing includes the endpoint
            stop = index       # changed from index + 1 so the window is exactly 90 rows and uses no future data
            volatility.append(stdev(df.loc[start:stop, 'delta']) * sqrt(252))
    
    df['volatility1'] = volatility
    df['volatility2'] = df['delta'].rolling(window=90).std(ddof=1) * sqrt(252) # changed to ddof=1 to match statistics.stdev (sample standard deviation)
    
    print(df)
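
    As a quick illustration of the two points above (a minimal sketch on a tiny throwaway Series, not part of the original code): .loc slicing includes its endpoint, and statistics.stdev agrees with numpy's std only when ddof=1.

    from statistics import stdev

    import numpy as np
    from pandas import Series

    s = Series([1.0, 2.0, 3.0, 4.0, 5.0])

    # label-based .loc slicing is inclusive of both ends: labels 0..4 -> 5 values,
    # while positional .iloc slicing stops before the endpoint -> 4 values
    print(len(s.loc[0:4]))   # 5
    print(len(s.iloc[0:4]))  # 4

    # statistics.stdev is the sample standard deviation, i.e. ddof=1;
    # numpy's default (ddof=0) is the population standard deviation
    x = s.to_numpy()
    print(stdev(x))           # ~1.5811
    print(np.std(x, ddof=1))  # same value
    print(np.std(x, ddof=0))  # ~1.4142, the smaller population estimate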
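
    And if you want the two columns to agree on the first 89 rows as well, here is a rough sketch of two options; pick whichever matches your intent:

    # option 1: make the rolling column match the loop by replacing its
    # leading NaNs with 0
    df['volatility2'] = df['volatility2'].fillna(0)

    # option 2: make the loop match rolling instead, by appending
    # float('nan') rather than 0 in the "if index < 89" branch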