Context: I'm working on an analytics system for an ordering system. There are about 100,000 orders per day, and the analytics need to run over the last N (say, 100) days. The relevant data fits in memory. Orders older than N days are evicted from the in-memory cache, one entire past day at a time. Orders can be created or updated.
A traditional approach would use a ConcurrentHashMap<Date, Queue<Order>>. Every day, values for keys representing dates more than N days in the past will be deleted. But, of course, the whole point of using Guava is to avoid this. EDIT: changed Map to ConcurrentHashMap, see the end of the question for rationale.
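For reference, the traditional approach described above could be sketched roughly as follows. This is only a minimal sketch under assumptions: the names DailyOrderStore, Order, add and evictOlderThan are hypothetical, and java.time.LocalDate stands in for Date as the day key.

```java
import java.time.LocalDate;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ConcurrentMap;

// Hypothetical names throughout; not taken from any existing code base.
final class DailyOrderStore {

    // Minimal stand-in for the real order type (assumption).
    static final class Order {
        final long id;
        final LocalDate day;
        Order(long id, LocalDate day) { this.id = id; this.day = day; }
    }

    private final ConcurrentMap<LocalDate, Queue<Order>> byDay = new ConcurrentHashMap<>();

    // Thread-safe insert: the queue for a day is created atomically on first use.
    void add(Order order) {
        byDay.computeIfAbsent(order.day, d -> new ConcurrentLinkedQueue<>()).add(order);
    }

    // Meant to be called once a day from outside (for example by a scheduler).
    // Drops every day strictly before today minus `days`; returns the number of days removed.
    int evictOlderThan(LocalDate today, int days) {
        LocalDate cutoff = today.minusDays(days);
        int removed = 0;
        for (LocalDate day : byDay.keySet()) {
            if (day.isBefore(cutoff) && byDay.remove(day) != null) {
                removed++;
            }
        }
        return removed;
    }

    // Total number of orders currently held (linear in the number of days).
    int orderCount() {
        int n = 0;
        for (Queue<Order> q : byDay.values()) {
            n += q.size();
        }
        return n;
    }
}
```

The store never looks at the age of an order on its own; eviction only happens when evictOlderThan is called by whatever runs the daily job.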
With Guava collections, a Multimap<Date, Order> would be simpler. Eviction would be similar, still implemented explicitly.
While the Cache implementation looks appealing (after all, I am implementing a cache), I'm not sure about the eviction options. Eviction only happens once a day and it's best initiated from outside the cache; I don't want the cache to have to check the age of an order. I'm not even sure whether the cache would use a Multimap, which I think is a suitable data structure in this case.
Thus, my question is: is it possible to use a Cache that uses and exposes the semantics of a Multimap and allows evictions controlled from outside itself, in particular with the rule I need ("delete all orders older than N days")?
As an important clarification, I'm not interested in a LoadingCache, but I do need bulk loads (if the application needs to be restarted, the cache has to be populated, from the database, with the last N days of orders).
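To make the bulk-load requirement concrete, here is a rough sketch of repopulating the day-keyed map after a restart. It assumes the rows have already been fetched from the database into a List; OrderBulkLoader, Order and load are hypothetical names, and LocalDate is used instead of Date.

```java
import java.time.LocalDate;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ConcurrentMap;

// Hypothetical helper; the database access itself is out of scope here.
final class OrderBulkLoader {

    // Minimal stand-in for the real order type (assumption).
    static final class Order {
        final long id;
        final LocalDate day;
        Order(long id, LocalDate day) { this.id = id; this.day = day; }
    }

    // Groups already-fetched rows by day, skipping anything that falls
    // before today minus `days`, i.e. outside the N-day window.
    static ConcurrentMap<LocalDate, Queue<Order>> load(List<Order> rows, LocalDate today, int days) {
        LocalDate cutoff = today.minusDays(days);
        ConcurrentMap<LocalDate, Queue<Order>> byDay = new ConcurrentHashMap<>();
        for (Order o : rows) {
            if (o.day.isBefore(cutoff)) {
                continue; // too old, would be evicted anyway
            }
            byDay.computeIfAbsent(o.day, d -> new ConcurrentLinkedQueue<>()).add(o);
        }
        return byDay;
    }
}
```

Nothing here needs a loader callback per key, which is why a LoadingCache is not required for this part.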
EDIT: Forgot to mention that the map needs to be concurrent: as orders come in, they are evaluated live against the previous orders for the same customer, location, etc.
EDIT2: Just stumbled over Guava issue 135. It looks like Multimap is not concurrent.
I would use neither a Cache nor a Multimap here. While I like and use both of them, there's not much to gain here.
The features of a Cache don't really get used here.
You already have a ConcurrentHashMap<Date, Queue<Order>> in mind, which is in a sense more powerful than a Multimap<Date, Order>.
I'd use a Cache if I thought about different eviction criteria and if I felt that losing any of its entries at any time¹ is fine.
You may find out that you need a ConcurrentMap<Date, Deque<Order>> or maybe a ConcurrentMap<Date, YourOwnQueueFastSearchList<Order>> or whatever. This could probably be managed somehow with a Multimap, but IMHO it gets more complicated instead of simpler.
I'd ask myself "what do I gain by using Cache or Multimap here?". To me it looks like the plain old ConcurrentMap offers about everything you need.
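As an illustration only, here is one way the plain ConcurrentMap idea might look with a Deque per day. The sketch swaps in a ConcurrentSkipListMap (a sorted ConcurrentMap) so that the oldest days can be dropped from the front; that choice, the names OrderWindow, Order, countForCustomer and evictBefore, and the use of LocalDate instead of Date are all my assumptions.

```java
import java.time.LocalDate;
import java.util.Deque;
import java.util.Map;
import java.util.concurrent.ConcurrentLinkedDeque;
import java.util.concurrent.ConcurrentNavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical names throughout.
final class OrderWindow {

    // Minimal stand-in for the real order type (assumption).
    static final class Order {
        final long id;
        final String customerId;
        final LocalDate day;
        Order(long id, String customerId, LocalDate day) {
            this.id = id;
            this.customerId = customerId;
            this.day = day;
        }
    }

    // Keys are kept sorted, so the oldest day is always the first entry.
    private final ConcurrentNavigableMap<LocalDate, Deque<Order>> byDay = new ConcurrentSkipListMap<>();

    // The mapping function may run more than once under contention; the extra
    // empty deque is simply discarded, so this is harmless here.
    void add(Order o) {
        byDay.computeIfAbsent(o.day, d -> new ConcurrentLinkedDeque<>()).addLast(o);
    }

    // Live evaluation against previous orders: a plain linear scan over the window.
    long countForCustomer(String customerId) {
        long n = 0;
        for (Deque<Order> day : byDay.values()) {
            for (Order o : day) {
                if (o.customerId.equals(customerId)) {
                    n++;
                }
            }
        }
        return n;
    }

    // Externally triggered eviction: removes whole days strictly before the cutoff.
    void evictBefore(LocalDate cutoff) {
        Map.Entry<LocalDate, Deque<Order>> oldest;
        while ((oldest = byDay.firstEntry()) != null && oldest.getKey().isBefore(cutoff)) {
            byDay.remove(oldest.getKey());
        }
    }

    int dayCount() {
        return byDay.size();
    }
}
```

With millions of orders in the window, the linear scan in countForCustomer would likely be too slow; a second map keyed by customer or location would be the obvious next step, and that kind of custom indexing is exactly where a plain map stays easier to adapt.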
¹ By no means am I suggesting this would happen with Guava. On the contrary, without an eviction reason (capacity, expiration, ...) it works just like a ConcurrentMap. It's just that what you've described feels more like a Map than a Cache.