Tags: python, python-3.x, numpy, caching, lru

How does lru_cache (from functools) work?


lru_cache gives massive speed-ups, especially for recursive code. I understand that a cache is a place to store data that must be served quickly, so that the computer does not have to recompute it.
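
To make the kind of improvement concrete, here is a minimal sketch. The function name `fib` is only an illustration and is not taken from any particular code base. Without the decorator the naive recursion recomputes the same values over and over; with it, each distinct argument is computed once.

```python
from functools import lru_cache


@lru_cache(maxsize=None)  # unbounded cache, one entry per distinct argument
def fib(n):
    """Naive recursive Fibonacci; only practical for larger n when cached."""
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)


value = fib(30)          # 832040
info = fib.cache_info()  # named tuple: hits, misses, maxsize, currsize
print(value, info)
```

The `cache_info()` counters show how many calls were answered from the cache instead of being recomputed.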

How does the Python lru_cache from functools work internally?

I'm looking for a specific answer: does it use dictionaries, like the rest of Python? Does it store only the return value?

I know that Python is heavily built on dictionaries, but I couldn't find a specific answer to this question.


Solution

  • The functools source code is available here: https://github.com/python/cpython/blob/master/Lib/functools.py

    lru_cache follows the "decorator with arguments" pattern: it hands the decorated function to _lru_cache_wrapper, which keeps a cache dictionary in its closure and saves the return values of the decorated function there (every decorated function gets its own cache dict). The dictionary key is built from the call arguments by the _make_key function. I added some upper-case comments below:

    # DEPENDING ON THE maxsize ARGUMENT, _lru_cache_wrapper
    # DEFINES AND RETURNS ONE OF SEVERAL wrapper FUNCTIONS
    
    def _lru_cache_wrapper(user_function, maxsize, typed, _CacheInfo):
        # Constants shared by all lru cache instances:
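        sentinel = object()      # unique object used to signal cache misses
        make_key = _make_key     # build a key from the function arguments
    
        cache = {}                                # RESULTS ARE SAVED HERE
        hits = misses = 0                         # COUNTERS REPORTED BY cache_info()
        cache_get = cache.get    # bound method to look up a key or return a default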
    
        # ... maxsize is None:
    
        def wrapper(*args, **kwds):
            # Simple caching without ordering or size limit
            nonlocal hits, misses
            key = make_key(args, kwds, typed)     # BUILD A KEY FROM ARGUMENTS
            result = cache_get(key, sentinel)     # TRYING TO GET PREVIOUS CALLS RESULT
            if result is not sentinel:            # ALREADY CALLED WITH PASSED ARGS
                hits += 1
                return result                     # RETURN SAVED RESULT
                                                  # WITHOUT ACTUALLY CALLING FUNCTION
            misses += 1
            result = user_function(*args, **kwds) # CACHE MISS: ACTUALLY CALL THE FUNCTION
            cache[key] = result                   # SAVE RESULT
    
            return result
        # ...
    
        return wrapper
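
The excerpt above leaves out parts of the real function, so here is a self-contained sketch of the same idea that can be run directly. It is a simplified stand-in, not the CPython implementation: the names `simple_cache`, `kwd_mark`, `square`, and `calls` are hypothetical, the key builder is much cruder than `_make_key` (it ignores the `typed` flag), and it covers only the unbounded case, so there is no eviction and no locking.

```python
import functools


def simple_cache(user_function):
    """Hypothetical, simplified version of the unbounded caching wrapper."""
    sentinel = object()   # tells "no entry" apart from a cached None
    kwd_mark = object()   # separates positional from keyword arguments in the key
    cache = {}            # maps argument key -> return value
    hits = misses = 0

    def make_key(args, kwds):
        # Simplified key: the positional args tuple, plus keyword items if any.
        # All arguments must be hashable, as with the real lru_cache.
        key = args
        if kwds:
            key += (kwd_mark,) + tuple(kwds.items())
        return key

    @functools.wraps(user_function)
    def wrapper(*args, **kwds):
        nonlocal hits, misses
        key = make_key(args, kwds)
        result = cache.get(key, sentinel)
        if result is not sentinel:        # seen these arguments before
            hits += 1
            return result
        misses += 1
        result = user_function(*args, **kwds)
        cache[key] = result               # remember the result for next time
        return result

    def cache_info():
        # Returns a plain tuple: (hits, misses, number of cached entries).
        return hits, misses, len(cache)

    wrapper.cache_info = cache_info
    return wrapper


calls = []  # records every time the undecorated body really runs


@simple_cache
def square(x):
    calls.append(x)
    return x * x


square(4)
square(4)   # answered from the cache, body not executed again
square(5)
print(calls, square.cache_info())
```

The second `square(4)` never reaches the function body, which is the whole effect of the cache. When a `maxsize` is given, the real implementation additionally tracks how recently each entry was used so that it can evict the least recently used one; that bookkeeping is not shown here.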