Tags: java, multithreading, concurrenthashmap

How can I tell when no operation is running on a ConcurrentHashMap (that is, when it is idle) in Java?


Whenever my ConcurrentHashMap is updated, I need to clear an existing file and write the entire map to it again. Clearing and rewriting the file on every update causes high latency. So I am thinking of writing the data to the file only when the map is idle, that is, when no update operation is in progress; otherwise I will wait until the map is idle.

Basically, I will be deleting Strings from the map continuously, and writing to the file every time I delete a String is very costly. So is there a way to know that no deletion operation is in progress on the ConcurrentHashMap?


Solution

  • So is there a way to know that no deletion operation is going on the ConcurrentHashMap?

    Short answer: no, there isn't a way.

    But even if there were, you would still run into problems. For example, suppose that new updates arrived immediately after you started clearing / writing.


    I think the solution is to use two maps and a queue.

    • When an update request happens:
      1. perform the update on the concurrent hashmap
      2. add the request to the queue
    • In a background thread:
      1. pull requests from the queue, and perform updates on the second (shadow) hashmap
      2. periodically or based on some other criteria, cease pulling requests and flush the shadow hashmap to the file.

    The primary hashmap is always updated quickly, and is always up to date. Operations updating and using the primary hashmap do not get (significantly) blocked.

    The queue provides request buffering while the shadow hashmap is being written.

    The second hashmap is only accessed by one thread, so it doesn't need to be concurrent. It can be a plain HashMap, which avoids the synchronization overhead of a concurrent map.

    The state of the file will typically be a little behind the primary hashmap. But that is inevitable. The only way to avoid that is to block updates to the primary map ... which is what you are trying to avoid.
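
    The two-maps-and-a-queue design described above can be sketched roughly as follows. This is only an illustration under assumptions: the class name ShadowedMap, its method names, the String-to-String map type, and the "key=value" file format are all hypothetical. It also wraps each primary update in compute(), which is an addition to the steps listed above; it is one way to keep the queue order for a given key consistent with the order in which the primary map was updated.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: class, method, and field names are illustrative only.
class ShadowedMap {
    // One queued request; a null value means "remove this key".
    private static final class Update {
        final String key;
        final String value;
        Update(String key, String value) { this.key = key; this.value = value; }
    }

    private final ConcurrentHashMap<String, String> primary = new ConcurrentHashMap<>();
    private final BlockingQueue<Update> queue = new LinkedBlockingQueue<>();
    // Used only by the background thread, so a plain HashMap is enough.
    private final Map<String, String> shadow = new HashMap<>();

    // Update the primary map and enqueue the request. Doing both inside compute()
    // keeps the queue order for a given key the same as the map's update order.
    void put(String key, String value) {
        primary.compute(key, (k, old) -> {
            queue.add(new Update(k, value));
            return value;
        });
    }

    void remove(String key) {
        primary.compute(key, (k, old) -> {
            queue.add(new Update(k, null));
            return null; // returning null removes the mapping
        });
    }

    String get(String key) { return primary.get(key); }

    // Background step 1: pull queued requests and apply them to the shadow map.
    int applyPending() {
        List<Update> batch = new ArrayList<>();
        queue.drainTo(batch);
        for (Update u : batch) {
            if (u.value == null) shadow.remove(u.key);
            else shadow.put(u.key, u.value);
        }
        return batch.size();
    }

    // The shadow map rendered as "key=value" lines (an assumed file format).
    List<String> snapshotLines() {
        List<String> lines = new ArrayList<>();
        shadow.forEach((k, v) -> lines.add(k + "=" + v));
        return lines;
    }

    // Background step 2: stop pulling and rewrite the whole file from the shadow map.
    void flush(Path file) throws IOException {
        Files.write(file, snapshotLines(), StandardCharsets.UTF_8);
    }

    // Runs both background steps on one dedicated thread at a fixed delay.
    ScheduledExecutorService start(Path file, long periodMillis) {
        ScheduledExecutorService flusher = Executors.newSingleThreadScheduledExecutor();
        flusher.scheduleWithFixedDelay(() -> {
            try {
                applyPending();
                flush(file);
            } catch (IOException e) {
                e.printStackTrace(); // keep the schedule alive; real code would log
            }
        }, periodMillis, periodMillis, TimeUnit.MILLISECONDS);
        return flusher;
    }
}
```

    Callers would use put(), remove() and get() from any thread, and call start(file, periodMillis) once. Requests that arrive while a flush is running simply wait in the queue until the next cycle, so the file lags the primary map by at most roughly one period plus the time a flush takes.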


    Another way to approach this would be to make writing to the file faster. I suspect it is slow because your current design requires you to clear and rewrite the file each time. An alternative is to write only the changes to the file. This means you may have more work to do on restart ... assuming the purpose of the file is to record the map state so that you can restart.
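
    A minimal sketch of the "write only the changes" idea, assuming a simple append-only text log. The ChangeLog class, its method names, and the tab-separated record format are hypothetical choices for illustration, and the sketch assumes keys and values contain no tabs or newlines.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: names and record format are illustrative only.
class ChangeLog {
    // Assumed record format: "+<TAB>key<TAB>value" for a put, "-<TAB>key" for a removal.
    static String putRecord(String key, String value) {
        return "+\t" + key + "\t" + value;
    }

    static String removeRecord(String key) {
        return "-\t" + key;
    }

    // Appends one record instead of clearing and rewriting the whole file.
    // Reopening the file per record keeps the sketch short; a long-lived
    // buffered writer would likely be faster.
    static void append(Path log, String record) throws IOException {
        Files.write(log, Collections.singletonList(record), StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    // Restart path: rebuild the map by replaying every record in order.
    static Map<String, String> replay(List<String> records) {
        Map<String, String> map = new HashMap<>();
        for (String r : records) {
            String[] parts = r.split("\t", 3);
            if (parts[0].equals("+") && parts.length == 3) {
                map.put(parts[1], parts[2]);
            } else if (parts[0].equals("-") && parts.length == 2) {
                map.remove(parts[1]);
            }
            // Anything else (for example a torn last line) is skipped.
        }
        return map;
    }

    // Reads the log file and replays it.
    static Map<String, String> load(Path log) throws IOException {
        return replay(Files.readAllLines(log, StandardCharsets.UTF_8));
    }
}
```

    Each deletion then costs one short append, for example ChangeLog.append(log, ChangeLog.removeRecord(key)), and the extra work moves to restart, where load(log) replays the whole history. Since such a log only grows, a real implementation would probably also rewrite a compact snapshot from time to time.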