
How many Caffeine Cache instances in an application is too much?


I have a use case where I want to cache a map of elements against String keys, where each element in the map can have its own expiry. I was planning to use a cache of caches and take advantage of the really cool variable expiry in Caffeine.

Something like:

Cache<String, Cache<String, ObjectWithVariableExpiry>>

Now, the internal Cache is supposed to be dynamically created and the parent cache can have thousands of entries. I'm wondering if this is ok to do or if it's a really bad use of Caffeine. My worry is that for each internal Cache<String, ObjectWithVariableExpiry> the timer threads/logic could become a resource hog.

Any suggestions are greatly appreciated.


Solution

  • I suppose there is no answer as to "too much" without profiling to see the impact on the heap, object churn, growth rate, etc.

    Is there behavior requiring the nested caching, or could a single cache with a composite key suffice? This would have the same number of entries and variable expiration, but avoid the overhead of new cache instances. Typically nesting is used to perform an operation around the group, e.g. a customer-specific cache where all of that customer's entries are invalidated together. If that's the case, there are alternatives like adding a generational id to the key, so that entries from older generations are no longer retrieved and are evicted lazily. The internal data structures are amortized O(1), so the number of entries has a small impact on performance.

    The overhead of the cache instances is memory, as the cache does not create its own threads. The cache is backed by a ConcurrentHashMap, uses multiple ring buffers, a Timing Wheel for variable expiration, padding to protect against false sharing, and a CountMin sketch if size bounded. This makes the cache a heavier object, but not excessive for a collection. If a Scheduler is set for prompt expiration, then it will schedule a single timer per cache instance.

    Most likely it won't be a problem. The cache is designed for concurrency and longer-lived usages. That means it isn't as optimal for non-concurrent cases with high instance creation, like a cache scoped to an HTTP request. It would certainly work fine, but it would add more pressure to the garbage collector compared to a simpler data structure.

    Unfortunately, the question doesn't give enough detail for a definitive answer. It is probably okay, there are simple remedies if you do see a negative effect, and a load test would provide stronger confidence.
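
As a rough illustration of the composite-key idea with a generational id mentioned above, here is a minimal sketch. All names (`GroupKey`, `Generations`, `GenerationalKeyDemo`) are hypothetical and not part of Caffeine. A plain `ConcurrentHashMap` stands in for the single cache so the example runs without any dependency, which also means nothing is evicted here; it only shows how bumping a generation makes a group's older entries unreachable.

```java
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

class GenerationalKeyDemo {
  public static void main(String[] args) {
    // Stand-in for ONE shared cache (assumption: in real use this would be a
    // single Caffeine cache keyed by GroupKey instead of a cache of caches).
    Map<GroupKey, String> cache = new ConcurrentHashMap<>();
    Generations generations = new Generations();

    cache.put(generations.keyFor("customer-1", "a"), "v1");
    System.out.println(cache.get(generations.keyFor("customer-1", "a"))); // prints "v1"

    // Logically invalidate the whole group without touching the cache.
    generations.invalidateGroup("customer-1");
    System.out.println(cache.get(generations.keyFor("customer-1", "a"))); // prints "null"
  }
}

/** Hypothetical composite key: group, the group's generation at write time, element key. */
final class GroupKey {
  final String group;
  final long generation;
  final String key;

  GroupKey(String group, long generation, String key) {
    this.group = group;
    this.generation = generation;
    this.key = key;
  }

  @Override public boolean equals(Object o) {
    if (!(o instanceof GroupKey)) return false;
    GroupKey other = (GroupKey) o;
    return generation == other.generation
        && group.equals(other.group)
        && key.equals(other.key);
  }

  @Override public int hashCode() {
    return Objects.hash(group, generation, key);
  }
}

/** Hypothetical helper tracking the current generation of each group. */
final class Generations {
  private final Map<String, AtomicLong> current = new ConcurrentHashMap<>();

  /** Builds a key bound to the group's current generation. */
  GroupKey keyFor(String group, String key) {
    return new GroupKey(group, counter(group).get(), key);
  }

  /** Bumps the generation so keys built earlier for this group no longer match. */
  void invalidateGroup(String group) {
    counter(group).incrementAndGet();
  }

  private AtomicLong counter(String group) {
    return current.computeIfAbsent(group, g -> new AtomicLong());
  }
}
```

With Caffeine, the stand-in map would presumably be replaced by one cache configured through `Caffeine.newBuilder().expireAfter(...)`, so the orphaned entries of an old generation are left to expire or be evicted by the cache's own policy rather than removed eagerly.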