java caching guava multimap

Cache or MultiMap for day-based cache expiration?


Context: I'm working on an analytics system for an ordering system. There are about 100,000 orders per day, and the analytics need to run over the last N (say, 100) days. The relevant data fits in memory. Orders older than N days are evicted from the in-memory cache, one whole day at a time. Orders can be created or updated.

  1. A traditional approach would use a ConcurrentHashMap<Date, Queue<Order>>. Every day, values for keys representing dates more than N days in the past will be deleted. But, of course, the whole point of using Guava is to avoid this. EDIT: changed Map to ConcurrentHashMap, see the end of the question for rationale.

  2. With Guava collections, a Multimap<Date, Order> would be simpler. Eviction is similar, implemented explicitly.

  3. While the Cache implementation looks appealing (after all, I am implementing a cache), I'm not sure about the eviction options. Eviction only happens once a day and is best initiated from outside the cache; I don't want the cache to have to check the age of an order. I'm not even sure whether the cache would use a Multimap, which I think is a suitable data structure in this case.

Thus, my question is: is it possible to use a Cache that uses and exposes the semantics of a Multimap and allows evictions controlled from outside itself, in particular with the rule I need ("delete all orders older than N days")?

As an important clarification, I'm not interested in a LoadingCache but I do need bulk loads (if the application needs to be restarted, the cache has to be populated, from the database, with the last N days of orders).

EDIT: Forgot to mention that the map needs to be concurrent: as orders come in, they are evaluated live against the previous orders for the same customer, location, etc.

EDIT2: Just stumbled over Guava issue 135. It looks like Multimap is not concurrent.


Solution

  • I would use neither a Cache nor a Multimap here. While I like and use both of them, there's not much to gain here.

    • You want to evict your entries manually, so the features of Cache don't really get used here.
    • You're considering ConcurrentHashMap<Date, Queue<Order>>, which is in a sense more powerful than a Multimap<Date, Order>.

    I'd use a Cache if I were considering different eviction criteria and if losing any of its entries at any time [1] were fine.

    You may find out that you need a ConcurrentMap<Date, Deque<Order>> or maybe ConcurrentMap<Date, YourOwnQueueFastSearchList<Order>> or whatever. This could probably be managed somehow by the Multimap, but IMHO it gets more complicated instead of simpler.

    I'd ask myself "what do I gain by using Cache or Multimap here?". To me it looks like the plain old ConcurrentMap offers about everything you need.
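
To make that concrete, here is a minimal sketch of what the plain ConcurrentMap approach might look like. It is only an illustration under some assumptions: the Order and OrderStore classes and all of their methods are hypothetical names, LocalDate stands in for the Date key from the question, and updates to existing orders are left out. Eviction is triggered from outside (for example by a daily scheduled job) and drops whole days at once.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ConcurrentMap;

// Hypothetical minimal order type; a real Order would carry customer, location, etc.
final class Order {
    final long id;
    final LocalDate day;

    Order(long id, LocalDate day) {
        this.id = id;
        this.day = day;
    }
}

// Hypothetical wrapper around a plain ConcurrentMap keyed by day.
final class OrderStore {
    private final ConcurrentMap<LocalDate, Queue<Order>> byDay = new ConcurrentHashMap<>();

    // Thread-safe insert: computeIfAbsent creates the per-day queue atomically.
    void add(Order order) {
        byDay.computeIfAbsent(order.day, d -> new ConcurrentLinkedQueue<>()).add(order);
    }

    // Bulk load, e.g. after a restart, from rows already read from the database.
    void addAll(Iterable<Order> orders) {
        for (Order o : orders) {
            add(o);
        }
    }

    // Snapshot of one day's orders (empty if the day is unknown or already evicted).
    List<Order> ordersOn(LocalDate day) {
        Queue<Order> q = byDay.get(day);
        return q == null ? new ArrayList<>() : new ArrayList<>(q);
    }

    // Externally triggered eviction: removes every day strictly before
    // (today - keepDays) and returns the number of days removed.
    int evictOlderThan(LocalDate today, int keepDays) {
        LocalDate cutoff = today.minusDays(keepDays);
        int removed = 0;
        // ConcurrentHashMap iterators are weakly consistent, so removing
        // entries while iterating over the key set is safe.
        for (LocalDate day : byDay.keySet()) {
            if (day.isBefore(cutoff) && byDay.remove(day) != null) {
                removed++;
            }
        }
        return removed;
    }

    int dayCount() {
        return byDay.size();
    }
}
```

One caveat with this sketch: an add for a day that is being evicted at the same moment can land in a queue that has just been removed from the map. Since such an order would be past the retention window anyway, that is probably acceptable here, but it is worth knowing about.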


    [1] By no means am I suggesting this would happen with Guava. On the contrary, without an eviction reason (capacity, expiration, ...) it works just like a ConcurrentMap. It's just that what you've described feels more like a Map than a Cache.