java, caching, ibm-was

Java Caching frameworks for maintaining huge data


Context: We are developing a RESTful service using Jersey 2.6 and will deploy it on WAS 8.5. This service needs to serve more than 10 million requests per day.

We need to implement a cache to store more than 300k objects (the data will come from the DB), and we need some way to update the cache on a daily basis.

  1. Is this approach of caching 300k objects and updating them on a daily basis recommended?
  2. Are there any Java frameworks which support this kind of functionality?

Solution

  • Your question is too general to get a clear answer. You need to describe the problem you are trying to solve.

    • Are you concerned about response times?
    • Are you trying to protect your DB from doing heavy lifting?
    • Are you expecting to have to scale out and want to be sure that you can deal with future loads?

    Additionally, some more contextual information would be useful, especially:

    • How dynamic is your data compared to your requests?
    • What percentage of your data population will be requested on average per day? (How many of the 3 lakh objects will be enquired upon at least once per day? If you don't know, provide your best guess).

    Your figures of 3 lakh (300k) data points and 10M requests mean that you expect to hit each object on average 33 times a day, which indicates that you are more concerned about back-end DB load than about your responses being right up to date.

    In my experience there are a lot of fairly primitive solutions which will work much better than going for a heavyweight distributed system such as Mongo, Cassandra or Coherence.

    My first response would be: Keep it simple - 300k objects is not too much to store in an internal hash table which you flush once a day and populate on first request.
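    As a rough illustration of that "keep it simple" approach, here is a minimal sketch using a plain ConcurrentHashMap that is populated lazily and cleared once a day. It assumes Java 8 is available on your WAS runtime (for computeIfAbsent and method references), and the loader function is a placeholder for whatever DB lookup you already have.

    ```java
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;
    import java.util.function.Function;

    /** Simple in-process cache: entries load from the DB on first request, whole map flushed daily. */
    public class DailyRefreshedCache<K, V> {

        private final ConcurrentHashMap<K, V> cache = new ConcurrentHashMap<>();
        private final Function<K, V> loader; // e.g. dao::findById (placeholder)

        public DailyRefreshedCache(Function<K, V> loader) {
            this.loader = loader;
            // Flush the whole map every 24 hours; entries are re-loaded lazily afterwards.
            ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
            scheduler.scheduleAtFixedRate(cache::clear, 1, 1, TimeUnit.DAYS);
        }

        public V get(K key) {
            // Hits the DB only when the entry is not already cached.
            return cache.computeIfAbsent(key, loader);
        }
    }
    ```

    If you would rather not roll your own, Guava's CacheBuilder (with expireAfterWrite) or Ehcache gives you the same behaviour with eviction and size limits handled for you.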

    If you need to scale horizontally, I would suggest memcached (for example via the spymemcached client) with a 1-day cache time, populating it whenever you don't find an existing entry.
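    A minimal sketch of that read-through pattern with spymemcached is below; the host/port and loadFromDb are placeholders for your environment and DB access, and values stored this way need to be Serializable for the default transcoder.

    ```java
    import java.io.IOException;
    import java.net.InetSocketAddress;

    import net.spy.memcached.MemcachedClient;

    public class MemcachedReadThrough {

        private static final int ONE_DAY_SECONDS = 24 * 60 * 60;

        private final MemcachedClient client;

        public MemcachedReadThrough(String host, int port) throws IOException {
            // Assumes a memcached instance is reachable at host:port (e.g. localhost:11211).
            this.client = new MemcachedClient(new InetSocketAddress(host, port));
        }

        public Object lookup(String key) {
            Object value = client.get(key);               // cache hit?
            if (value == null) {
                value = loadFromDb(key);                  // cache miss: go to the DB
                client.set(key, ONE_DAY_SECONDS, value);  // keep it for one day
            }
            return value;
        }

        private Object loadFromDb(String key) {
            // Placeholder for your actual DAO / repository call.
            throw new UnsupportedOperationException("wire in your DB access here");
        }
    }
    ```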

    I would NOT go for something like Cassandra or Mongo unless you have really compelling reasons to require a persistent store. Rationale: purging can become really onerous, especially if your data is fast-moving. For example, Cassandra does not really know how to delete, but instead "tombstones" deleted entries, which means that your data store will simply grow and grow until you create a strategy for purging.