Tags: java, caching, optimization, memory, concurrenthashmap

Writing a highly performant Cache


I wrote a stock market simulator which uses a ConcurrentHashMap as a cache.

The cache holds about 75 elements, but they are updated and retrieved very quickly (~500 times a second).

Here is what I did:

Thread 1:

Connects to an outside system which provides me with streaming quotes for a given stock symbol.

Thread 2 (callback thread):

Waits until data is delivered to it by the outside system. Once it gets the data, it parses it, creates an immutable DataEntry object, caches it, and signals thread 3.

Thread 3 (Consumer thread): Upon receiving the signal, retrieves the DataEntry from the cache and uses it. (Part of the task is that thread 2 must not push data directly to thread 3.)
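The hand-off between threads 2 and 3 can be sketched roughly as follows, assuming the "signal" is just the symbol key placed on a BlockingQueue (the queue, the `onQuote`/`consumeOne` methods, and the DataEntry fields are illustrative, not from the original code):

```java
import java.util.Map;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;

public class QuoteHandoff {

    // Immutable snapshot of one quote; the real class has ~25 fields.
    static final class DataEntry {
        final String symbol;
        final double lastPrice;
        DataEntry(String symbol, double lastPrice) {
            this.symbol = symbol;
            this.lastPrice = lastPrice;
        }
    }

    static final Map<String, DataEntry> cache = new ConcurrentHashMap<>();
    // The "signal": only the key travels between threads, never the data itself.
    static final BlockingQueue<String> signals = new ArrayBlockingQueue<>(1024);

    // Thread 2 (callback): cache the entry first, then signal the consumer.
    static void onQuote(String symbol, double price) {
        cache.put(symbol, new DataEntry(symbol, price));
        signals.offer(symbol); // in real code, put() to block when the queue is full
    }

    // Thread 3 (consumer): wait for a signal, then read from the cache.
    static DataEntry consumeOne() throws InterruptedException {
        String symbol = signals.take(); // blocks until a signal arrives
        return cache.get(symbol);
    }
}
```

Because the entry is cached before the signal is sent, the consumer is guaranteed to see an entry at least as fresh as the one the signal refers to.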

public final class DataEntry{

      private final String field1;
      private final String field2;
      //...
      private final String field25;

      // Constructor plus corresponding getters (fields are final, so no setters)

}

public final class Cache{

        private final Map<String, DataEntry> cache;

        public Cache() {
           this.cache = new ConcurrentHashMap<String, DataEntry>(65, 0.75f, 32);
        }

        // Methods to update and retrieve DataEntry from the cache.
}

After running it through a profiler, I noticed that I am creating a lot of DataEntry objects, so the Eden space is filling up very quickly.

So, I am thinking of tweaking the design a bit by:

a) Making the DataEntry class mutable.

b) Pre-populating the cache with empty DataEntry objects.

c) When the update arrives, retrieve the DataEntry object from the map and populate the fields.

This way, the number of DataEntry objects will stay constant and equal to the number of elements.
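Steps (a)–(c) above can be sketched as follows; the field names and the `populate`/`update` methods are illustrative (the real class has 25 String fields):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class PrepopulatedCache {

    // a) Now mutable: the same instance is reused for every update of a symbol.
    static final class DataEntry {
        private String field1;
        private String field2;

        void populate(String f1, String f2) {
            this.field1 = f1;
            this.field2 = f2;
        }

        String getField1() { return field1; }
        String getField2() { return field2; }
    }

    private final Map<String, DataEntry> cache = new ConcurrentHashMap<>();

    // b) Pre-populate the cache with empty entries, one per symbol.
    PrepopulatedCache(Iterable<String> symbols) {
        for (String symbol : symbols) {
            cache.put(symbol, new DataEntry());
        }
    }

    // c) On update, reuse the existing entry instead of allocating a new one.
    void update(String symbol, String f1, String f2) {
        cache.get(symbol).populate(f1, f2);
    }

    DataEntry get(String symbol) {
        return cache.get(symbol);
    }
}
```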

My questions are:

a) Does this design have any concurrency issues that I may have introduced by making DataEntry mutable?

b) Is there anything else I can do to optimize the cache?

Thanks.


Solution

  • I wouldn't worry about the speed of ConcurrentHashMap:

    Map<Integer, Integer> map = new ConcurrentHashMap<>();
    long start = System.nanoTime();
    int runs = 200*1000*1000;
    for (int r = 0; r < runs; r++) {
        map.put(r & 127, r & 127);
        map.get((~r) & 127);
    }
    long time = System.nanoTime() - start;
    System.out.printf("Throughput of %.1f million accesses per second%n",
            2 * runs / 1e6 / (time / 1e9));
    

    prints

    Throughput of 72.6 million accesses per second
    

    This is far beyond the access rate you appear to be using.

    If you want to reduce garbage you can use mutable objects and primitives. For the same reason I would avoid using String fields, as you appear to have far more Strings than data entries.
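    For example, if the quote fields are really numbers, holding them as primitives in a reused mutable entry means an update allocates nothing at all. A sketch with illustrative field names; note that a mutable entry shared between threads needs synchronization (or equivalent) so a reader never sees a half-written update:

    ```java
    public class PrimitiveEntryDemo {

        // Reused mutable entry holding primitives instead of Strings.
        // Updating it creates zero garbage per tick; the synchronized
        // methods keep the three fields consistent for readers.
        static final class PrimitiveDataEntry {
            private long lastTradeTimeMillis;
            private double lastPrice;
            private int volume;

            synchronized void update(long timeMillis, double price, int volume) {
                this.lastTradeTimeMillis = timeMillis;
                this.lastPrice = price;
                this.volume = volume;
            }

            synchronized long lastTradeTimeMillis() { return lastTradeTimeMillis; }
            synchronized double lastPrice() { return lastPrice; }
            synchronized int volume() { return volume; }
        }
    }
    ```

    With only ~75 entries and ~500 updates a second, the cost of the uncontended locks here is negligible next to the garbage saved.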