java caching garbage-collection ehcache distributed-computing

How to tune a cache-intensive app in Java?


Does anyone know what the proper configuration/development approach is when writing an application that uses only a cache as its store?

To give some background, the application doesn't need to store any information (it actually stores a timestamp, but I'll explain that later) because it only reads what another app writes. We have a stored procedure that reads from that application's database and returns the information as of that point. From the moment our application starts, every update is notified through a topic, so the database is no longer needed (until the next restart).

Once everything is loaded, every record in the cache has to be read when certain messages are consumed, so that we can loop through the records and process them individually. The application keeps a Map of Lock objects, one per record in the cache, to avoid race conditions. If a record meets certain criteria, a timestamp is written to the cache and to a database using write-behind of up to 5000 records.
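To illustrate the per-record locking described above, here is a minimal sketch. The class name RecordLocks, the method withLock, and the use of a long as the record key are assumptions made for this example, not the application's actual code.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

// Hypothetical helper: keeps one Lock per record key, created lazily on first use.
class RecordLocks {
    private final ConcurrentMap<Long, Lock> locks = new ConcurrentHashMap<>();

    // Runs the action while holding the lock that belongs to the given record.
    <T> T withLock(long recordId, Supplier<T> action) {
        Lock lock = locks.computeIfAbsent(recordId, id -> new ReentrantLock());
        lock.lock();
        try {
            return action.get();
        } finally {
            lock.unlock(); // always released, even if the action throws
        }
    }

    // Number of lock objects currently retained in the map.
    int size() {
        return locks.size();
    }

    public static void main(String[] args) {
        RecordLocks recordLocks = new RecordLocks();
        String result = recordLocks.withLock(42L, () -> "processed record 42");
        System.out.println(result);
        System.out.println("locks held in map: " + recordLocks.size());
    }
}
```

Note that in a scheme like this the map keeps one long-lived lock object (plus a map entry) per record, so with hundreds of thousands of records it adds to the live heap that every major GC has to trace.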

The application is already developed, but I think we have some problems with GC. We keep getting spikes, and I would like to know if there are any recommendations on how to reduce them.

These are the things we've done so far:

  • There is a collection of Strings that is repeated over and over for each record. I'm interning these (we are using Java 8).
  • The cache we are using is EhCache. To avoid recreating objects, the element from the cache is used directly.
  • Every variable is a long or a String, except for an enum value and a LocalDateTime that is required for some date checks.
  • There are two caches. This is because, once a criterion is met, a timestamp has to be replicated to another instance of the app. For this, we are using EhCache's JMS replication, which uses topics for these updates.
  • The timestamp updates don't happen very often, so their impact should be minimal.
  • There are currently 350,000 records, each with a bunch of Strings and longs alongside the enum and LocalDateTime mentioned before.
  • A random problem we have is that it sometimes throws "GC overhead limit exceeded". Normally the application's memory usage keeps dropping after some GCs, but it seems that sometimes it cannot handle the load.
  • The box has 3 GB of memory for this, and after a major GC the application uses around 500 MB for the cache.
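To make the first bullet concrete, here is a rough sketch of what deduplicating the repeated Strings amounts to. It shows String.intern() next to an application-level pool; the StringPool class and the sample value "ACTIVE" are made up for illustration and are not taken from the application.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical application-level pool: returns one shared instance per distinct value.
class StringPool {
    private final ConcurrentMap<String, String> pool = new ConcurrentHashMap<>();

    // Returns the first instance ever seen with this value, so duplicates become garbage.
    String canonical(String s) {
        if (s == null) {
            return null;
        }
        String existing = pool.putIfAbsent(s, s);
        return existing != null ? existing : s;
    }

    int size() {
        return pool.size();
    }

    public static void main(String[] args) {
        StringPool pool = new StringPool();
        // Two distinct objects with equal content, as when records are loaded one by one.
        String a = new String("ACTIVE");
        String b = new String("ACTIVE");

        System.out.println(a == b);                                 // false: two objects
        System.out.println(pool.canonical(a) == pool.canonical(b)); // true: one shared object
        System.out.println(a.intern() == b.intern());               // true: JVM string table
    }
}
```

Either way, only one copy of each distinct value stays reachable, so the benefit depends on how many distinct values there are compared with the number of records.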

Apart from this, I don't know how the JVM is configured or which GC it uses. Any ideas, or any blogs or documents someone could suggest for me to start reading?

Thanks!


Solution

  • As you are running Java 8, you could change the garbage collector. The so-called "Garbage First" (G1) GC has been available as an option since early versions of Java 7. The problems from its infancy have been resolved, and it is often recommended for interactive applications that need fast response times.

    It can be enabled with -XX:+UseG1GC, and it will become the default in Java 9.

    Read more about it at http://www.oracle.com/technetwork/tutorials/tutorials-1876574.html
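    Since the question mentions not knowing how the JVM is configured, a small check like the following sketch can report which collectors are active and which flags the JVM was started with. It uses only the standard java.lang.management API; the class name GcInfo is an assumption for this example. On HotSpot, G1 typically shows up under names such as "G1 Young Generation" and "G1 Old Generation".

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic helper: reports the active collectors and JVM start-up flags.
class GcInfo {
    // Names of the garbage collectors the running JVM exposes through JMX.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    // Arguments passed to the JVM itself (e.g. -Xmx or -XX options), not to main().
    static List<String> jvmArguments() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    public static void main(String[] args) {
        System.out.println("Collectors: " + collectorNames());
        System.out.println("JVM args:   " + jvmArguments());
        System.out.println("Max heap (bytes): " + Runtime.getRuntime().maxMemory());
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Cumulative counts and times since JVM start; -1 means "undefined" for that collector.
            System.out.println(gc.getName() + ": " + gc.getCollectionCount()
                    + " collections, " + gc.getCollectionTime() + " ms total");
        }
    }
}
```

    Logging these values at application start-up, before and after adding -XX:+UseG1GC, gives a simple way to confirm the switch took effect and to compare accumulated collection times.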