
Delete Caffeine entries based on a timestamp condition


Is there a way to remove Caffeine entries based on a timestamp condition? For example, at T1 I have the following entries:

K1 -> V1
K2 -> V2
K3 -> V3

At time T2 I update only K2 and K3. (I don't know whether both entries will have exactly the same timestamp: K2 might have T2 while K3 might have T2 plus a few nanoseconds. For the sake of this question, let's assume they do.)

Now I want Caffeine to invalidate the entry K1 -> V1 because T1 < T2.

One way to do this is to iterate over the entries, check whether each write timestamp is < T2, collect the matching keys, and at the end call invalidateAll(keys).

Maybe there is a non-iterative way?


Solution

  • If you are using expireAfterWrite, then you can obtain a snapshot of the entries in timestamp order. As this call requires obtaining the eviction lock, it provides an immutable snapshot rather than an iterator. That is messy: you have to provide a limit, which might be too small to cover every stale entry, and the approach only works when expiration is configured.

    Duration maxAge = Duration.ofMinutes(1);
    cache.policy().expireAfterWrite().ifPresent(policy -> {
      Map<K, V> oldest = policy.oldest(1_000);
      for (K key : oldest.keySet()) {
        // Remove everything written more than 1 minute ago
        policy.ageOf(key)
          .filter(duration -> duration.compareTo(maxAge) > 0)
          .ifPresent(duration -> cache.invalidate(key));
      }
    });
    

    If you maintain the timestamp yourself, then an unordered iteration is possible using the cache.asMap() view. That is likely the simplest and fastest option.

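    long cutoff = ...
    // Collect the keys (not the entries) whose value was written before the cutoff
    var keys = cache.asMap().entrySet().stream()
      .filter(entry -> entry.getValue().timestamp() < cutoff)
      .map(Map.Entry::getKey)
      .collect(toList());
    cache.invalidateAll(keys);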
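
    The snippet above presumes that each cached value exposes a timestamp() accessor. As a minimal sketch of one way that could look, the following uses a hypothetical Stamped record and a hypothetical StaleKeys helper (neither is part of Caffeine), and a plain Map standing in for the cache.asMap() view so that it runs without the library:

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

/** Hypothetical value wrapper that carries the time of the write (not a Caffeine type). */
record Stamped<V>(V value, long timestamp) {}

/** Hypothetical helper; the map parameter stands in for cache.asMap(). */
final class StaleKeys {
  private StaleKeys() {}

  /** Returns the keys whose value was stamped strictly before the cutoff. */
  static <K, V> List<K> olderThan(Map<K, Stamped<V>> view, long cutoff) {
    return view.entrySet().stream()
        .filter(entry -> entry.getValue().timestamp() < cutoff)
        .map(Map.Entry::getKey)
        .collect(Collectors.toList());
  }
}
```

    With a real cache, the returned list could then be handed to cache.invalidateAll(keys), as in the snippet above.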
    

    An approach that won't work, but is worth mentioning to explain why, is variable expiration via expireAfter(expiry). You can set a new duration on every read based on the prior setting. However, this takes effect only after the entry is returned to the caller, so while you can expire the entry immediately, the cache will still serve K1 at least once.

    Otherwise you could validate at retrieval time outside of the cache and rely on size eviction. The flaw with this approach is that it pollutes the cache with dead entries.

    V value = cache.get(key);
    if (value.timestamp() < cutoff) {
      cache.asMap().remove(key, value);
      return cache.get(key); // load a new value
    }
    return value;
    

    Or you could maintain your own write-order queue, etc. All of these get messier the fancier you get. For your case, a full iteration is likely the simplest and least error-prone approach.
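
    To illustrate the write-order queue idea and why it adds bookkeeping, here is a rough sketch. WriteOrderTracker is a hypothetical helper, not part of Caffeine. It assumes that timestamps passed to recordWrite never decrease, and it knows nothing about entries the cache evicts on its own, so it may return keys that are already gone:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: remembers keys in write order so stale ones can be found without a full scan. */
final class WriteOrderTracker<K> {
  private record Write<T>(T key, long timestamp) {}

  private final Deque<Write<K>> queue = new ArrayDeque<>();
  private final Map<K, Long> latest = new HashMap<>();

  /** Call alongside every cache write; assumes timestamps are non-decreasing. */
  synchronized void recordWrite(K key, long timestamp) {
    latest.put(key, timestamp);
    queue.addLast(new Write<>(key, timestamp));
  }

  /** Removes and returns the keys whose most recent write is strictly before the cutoff. */
  synchronized List<K> drainOlderThan(long cutoff) {
    List<K> stale = new ArrayList<>();
    while (!queue.isEmpty() && queue.peekFirst().timestamp() < cutoff) {
      Write<K> write = queue.pollFirst();
      // A queue record is outdated if the key was rewritten later; only the
      // record matching the latest timestamp marks the key as stale.
      if (latest.remove(write.key(), write.timestamp())) {
        stale.add(write.key());
      }
    }
    return stale;
  }
}
```

    The drained keys could then be passed to cache.invalidateAll(keys). The extra queue, the outdated-record check, and the synchronization are the kind of mess the full iteration avoids.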