Is there a Python data structure that seamlessly combines a dictionary (with nested dictionaries or lists as values) and a heap, allowing sorting based on a specific value within the nested structures?
cache = {"key1": {"time": time1, "info": "key1 info"}, "key2": {"time": time2, "info": "key2 info"}, ...}
or:
cache = {"key1": [time1, "key1 info"], "key2": [time2, "key2 info"], ...}
here time1
, time2
, ... is the time of insertion or update of the entry.
The goal is to implement an efficient cache, checking key existence, validating the value's freshness (as it becomes outdated over time), and removing the oldest key when the cache is full. The dictionary should support heap operations either by using the nested key "time" or the zeroth element of the list.
Current options considered:
Is there a more efficient solution or a different approach that avoids creating a custom data structure?
Python's dict
is already insertion ordered (on old versions, use collections.OrderedDict
) similar to a time-sorted heap. This means the oldest inserted item is always at the front. Instead of updating values, remove-and-insert the item instead to force moving updated items to the back.
In order to use dict
as a time-based cache, use the following approach. You can turn this into a separate subclass, provide helper functions, or add the code inline. A subclass is nicer and avoids misuses but involves many special methods, so I will show the simpler helper functions and assume a Time To Live (ttl
) is given.
"key": (time, value)
. It is advantageous to have immutable values (i.e. a tuple
or NamedTuple
) to prevent invalidating the constraint of a valid time
.def set(cache, key, value):
cache.pop(key, None) # clear the previous position if any
cache[key] = (time.monotonic(), value))
def get(cache, key):
key_time, value = cache[key]
if key_time < time.monotonic() + ttl: # check timestamp validity
del cache[key]
raise KeyError(key)
return value
def free(cache):
if cache:
oldest_key = next(iter(cache))
del cache[oldest_key]
def clean(cache):
outdated, deadline = [], time.monotonic() + ttl
for key, (key_time, _) in cache.items():
if key_time < deadline:
outdated.append(key)
else: # all following keys are valid as well
break
for key in outdated:
del cache[key]