python pandas memoization mutable

Pandas Series argument function memoization


I want to memoize a function with mutable parameters (Pandas Series objects). Is there any way to accomplish this?

Here's a simple Fibonacci example; the parameter is a Pandas Series whose first element represents the sequence's index.

Example:

from functools import lru_cache

import pandas as pd

@lru_cache(maxsize=None)
def fib(n):
    if n.iloc[0] == 1 or n.iloc[0] == 2:
        return 1
    min1 = n.copy()
    min1.iloc[0] -= 1
    min2 = n.copy()
    min2.iloc[0] -= 2
    return fib(min1) + fib(min2)

Call function:

fib(pd.Series([15,0]))

Result:

TypeError: 'Series' objects are mutable, thus they cannot be hashed

My real use case is more complex, so I've posted this contrived but simple example.


Solution

  • I wrote a wrapper that replaces the Pandas Series argument with a tuple (a frozen equivalent), as @abarnert and @Calvin suggested. A tuple of hashable items is itself hashable, so lru_cache can use it as a cache key and the function can now be memoized.

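    from collections import OrderedDict

    import pandas as pd

    def freeze_series(f):
        def wrapper(series):
            # Convert the Series to a hashable tuple of (label, value) pairs
            return f(tuple(series.to_dict(into=OrderedDict).items()))
        return wrapper
    

    Here's a plain helper function (not a decorator) that unfreezes the tuple back into a Pandas Series:

    def unfreeze_series(frozen_series):
        # Rebuild the Series from the (label, value) pairs
        return pd.Series(OrderedDict(frozen_series))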
    

    Applied to the example from the question, it looks like this (note that freeze_series must be the outer decorator, so that lru_cache only ever sees the tuple):

    from functools import lru_cache
    
    @freeze_series
    @lru_cache(maxsize=None)
    def fib(n):
        n = unfreeze_series(n)
        if n.iloc[0] == 1 or n.iloc[0] == 2:
            return 1
        min1 = n.copy()
        min1.iloc[0] -= 1
        min2 = n.copy()
        min2.iloc[0] -= 2
        return fib(min1) + fib(min2)
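
For reference, the freeze, cache, and unfreeze steps above can also be folded into a single decorator, so the function body keeps receiving a Series. This is only a sketch under a few assumptions: the names `memoize_series` and `cached` are hypothetical (they are not part of the answer above), the function takes exactly one Series argument, every value in the Series is hashable, and `Series.items()` is used in place of `to_dict` to get the (label, value) pairs.

```python
from functools import lru_cache, wraps

import pandas as pd


def memoize_series(func):
    """Memoize a one-argument function whose argument is a pandas Series.

    Hypothetical helper: it combines the freeze, cache, and unfreeze
    steps into one decorator.
    """

    @lru_cache(maxsize=None)
    def cached(frozen):
        # Unfreeze: rebuild the Series from the hashable (label, value) pairs.
        labels = [label for label, _ in frozen]
        values = [value for _, value in frozen]
        return func(pd.Series(values, index=labels))

    @wraps(func)
    def wrapper(series):
        # Freeze: Series.items() yields (label, value) pairs in index order.
        return cached(tuple(series.items()))

    # Expose the cache statistics of the inner cached function.
    wrapper.cache_info = cached.cache_info
    return wrapper


@memoize_series
def fib(n):
    if n.iloc[0] in (1, 2):
        return 1
    min1 = n.copy()
    min1.iloc[0] -= 1
    min2 = n.copy()
    min2.iloc[0] -= 2
    # The name fib refers to the decorated function, so the recursive
    # calls also go through the cache.
    return fib(min1) + fib(min2)


result = fib(pd.Series([15, 0]))
info = fib.cache_info()
print(result)  # 610
print(info)
```

Because the cache key contains every label and value, two Series only share a cache entry when they match in full, not just in the first element. Each call also pays for the conversion to a tuple and back, so this is likely worthwhile only when the memoized function is considerably more expensive than that conversion.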