
Pandas rolling window operation inconsistent results based on the length of a series


I stumbled upon a weird behaviour in the windowing functionality in pandas: a rolling sum operation seems to give different results depending on the length of the series itself.

Given 2 series:

s1 = pd.Series(np.arange(5), index=range(5))        # s1 = 0, 1, 2, 3, 4
s2 = pd.Series(np.arange(2, 5), index=range(2, 5))  # s2 = 2, 3, 4

We apply a rolling sum on both:

k = 0.1
r1 = (s1 * k).rolling(2).sum().dropna()  # r1 = 0.1, 0.3, 0.5, 0.7
r2 = (s2 * k).rolling(2).sum().dropna()  # r2 = 0.5, 0.7

# keep only the values of r1 whose index also appears in r2
r1 = r1[r2.index]  # r1 = 0.5, 0.7

# now r1 should be exactly the same as r2, let's check the indices:
all(r1.index == r2.index)  # => True

However, if we check the values, they are not exactly equal:

r1.iloc[0] == r2.iloc[0]               # => False
abs(r1.iloc[0] - r2.iloc[0]) < 1e-15   # => True
abs(r1.iloc[0] - r2.iloc[0]) < 1e-16   # => False

I am aware that floating point operations are not exact, and I don't think the observed behaviour is a bug.

However, I would assume that the same deterministic calculations within the window(s) are applied to both series, so I would expect the results to be exactly the same.

I am curious as to what exactly is causing this behaviour in the implementation of the window operation.


Solution

  • I think it has to do with numpy.sum rather than rolling or the series length.

    As the notes in the numpy.sum documentation explain:

    For floating point numbers the numerical precision of sum (and np.add.reduce) is in general limited by directly adding each number individually to the result causing rounding errors in every step. However, often numpy will use a numerically better approach (partial pairwise summation) leading to improved precision in many use-cases. This improved precision is always provided when no axis is given. When axis is given, it will depend on which axis is summed. Technically, to provide the best speed possible, the improved precision is only used when the summation is along the fast axis in memory. Note that the exact precision may vary depending on other parameters.

    In contrast to NumPy, Python's math.fsum function uses a slower but more precise approach to summation. Especially when summing a large number of lower precision floating point numbers, such as float32, numerical errors can become significant. In such cases it can be advisable to use dtype="float64" to use a higher precision for the output.
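
    A quick sketch of the effect described there (the data and seed are arbitrary; only the relative errors matter). This illustrates pairwise vs. sequential summation in general, not pandas internals:

    import math
    import numpy as np

    rng = np.random.default_rng(0)
    a = rng.random(100_000).astype(np.float32)

    # exact reference: math.fsum tracks partial sums without intermediate rounding
    exact = math.fsum(a)

    # naive left-to-right accumulation in float32: a rounding error on every add
    naive = np.float32(0.0)
    for x in a:
        naive += x

    print(abs(float(np.sum(a)) - exact))  # pairwise summation: small error
    print(abs(float(naive) - exact))      # sequential loop: noticeably larger error

    Applying math.fsum per window to the original example makes both rolling sums agree exactly: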

    import math
    import pandas as pd
    import numpy as np
    
    k = .1
    s1 = pd.Series(np.arange(5), index=range(5))
    s2 = pd.Series(np.arange(2, 5), index=range(2, 5))
    
    r1_new = s1.mul(k).rolling(2).agg(math.fsum).dropna()
    r2_new = s2.mul(k).rolling(2).agg(math.fsum).dropna()
    
    r1_new.iloc[2:] == r2_new  # --> True, True
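    Note that passing a Python callable to agg means math.fsum runs once per window in the interpreter, so it is far slower than the Cython-backed rolling sum; fine for verification, costly on large series. An equivalent, index-aligned check:

    print(r1_new.loc[r2_new.index].equals(r2_new))  # --> True
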
    

    In this very specific case, less precision seems to work better, i.e. float32 rather than float64:

    k = .1
    s1 = pd.Series(np.arange(5), index=range(5))
    s2 = pd.Series(np.arange(2, 5), index=range(2, 5))
    
    r1_new = s1.mul(k).astype(np.float32).rolling(2).sum().dropna()
    r2_new = s2.mul(k).astype(np.float32).rolling(2).sum().dropna()
    
    r1_new.iloc[2:] == r2_new  # --> True, True
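
    Independent of which summation kernel is responsible, a plain-Python sketch shows how accumulation history alone can produce a difference of the same order as the one observed above. This mimics a generic incremental moving sum (add the incoming value, subtract the outgoing one); it is an illustration, not necessarily pandas' actual implementation:

    vals = [i * 0.1 for i in range(4)]  # the first four values of s1 * k

    run = vals[0] + vals[1]             # first full window: (0.0, 0.1)
    run = run + vals[2] - vals[0]       # slide to (0.1, 0.2)
    run = run + vals[3] - vals[1]       # slide to (0.2, 0.3) -> 0.5000000000000001

    fresh = vals[2] + vals[3]           # same window summed from scratch -> 0.5

    print(run == fresh)                 # False
    print(abs(run - fresh))             # ~1.1e-16, within the bounds observed above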