We are trying to optimize the performance of Hazelcast. We run a 16-node cluster (each node is an 8-core VM) with 4001 partitions in total, and we have configured 50 operation threads per node. We need better performance, i.e. higher throughput and lower response times, so we are thinking of configuring hazelcast.operation.generic.thread.count as well.
1) What is the difference between hazelcast.operation.generic.thread.count and hazelcast.operation.thread.count? What kind of operations do the generic operation threads handle?
2) The ratio of partitions to operation threads is about 5:1. We intend to lower this ratio, on the assumption that doing so will improve performance. Which is recommended: adding nodes, or increasing the operation thread count on the existing nodes?
3) Is scaling out linearly by adding Hazelcast nodes, while keeping the number of cores and the memory per node the same, advisable in our situation?
As described here, http://docs.hazelcast.org/docs/3.10.4/manual/html-single/index.html#partition-aware-operations, hazelcast.operation.thread.count controls the size of the thread pool for partition-aware operations such as imap.get/put/delete. If you want to improve the performance of these operations, you can modify this property. Its default value is the number of CPU cores, 8 in your case.
hazelcast.operation.generic.thread.count controls the thread pool size for generic operations, i.e. operations that are not tied to a specific partition, such as iexecutor.execute. I believe you are not interested in improving performance for those kinds of operations.
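To make the two properties concrete, here is a minimal sketch of setting both of them. It assumes the values are supplied as JVM system properties before the Hazelcast instance starts (the same effect as passing -D flags on the command line). The class name, the build helper, and the generic thread count chosen in main are illustrative assumptions, not recommended values.

```java
import java.util.Properties;

public class OperationThreadProps {
    // Property names as discussed above.
    static final String PARTITION_THREADS = "hazelcast.operation.thread.count";
    static final String GENERIC_THREADS = "hazelcast.operation.generic.thread.count";

    // Hypothetical helper: sizes the partition-operation pool to the core count
    // and sets the generic pool to a caller-chosen value.
    static Properties build(int cores, int genericThreads) {
        Properties props = new Properties();
        props.setProperty(PARTITION_THREADS, Integer.toString(cores));
        props.setProperty(GENERIC_THREADS, Integer.toString(genericThreads));
        return props;
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        // The generic thread count here is an arbitrary illustrative choice.
        Properties props = build(cores, Math.max(2, cores / 2));
        // Publish as JVM system properties (assumption: done before Hazelcast starts).
        props.forEach((k, v) -> System.setProperty((String) k, (String) v));
        System.out.println(props);
    }
}
```

On an 8-core VM this keeps the partition-operation pool at 8 threads instead of 50. Whether that suits your workload is something to measure under load.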
One important point: since you have 4001 partitions, what is your data size? Hazelcast suggests that a partition should be between 50 and 100 MB. (Please check https://hazelcast.com/resources/hazelcast-deployment-operations-guide/) So in your case, I would expect you to have 200-400 GB of data. If not, you have too many small partitions, which also affects performance.
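The sizing arithmetic behind that estimate can be sketched as follows. The class and method names are made up for illustration, 1 GB is taken as 1000 MB, and the 20 GB data set is a hypothetical figure, not something from your setup.

```java
public class PartitionSizing {
    // Total data volume (in MB) implied by a target size per partition.
    static long impliedDataMb(int partitions, int mbPerPartition) {
        return (long) partitions * mbPerPartition;
    }

    // Average partition size (in MB) for a given total data volume.
    static double averagePartitionMb(long totalDataMb, int partitions) {
        return (double) totalDataMb / partitions;
    }

    public static void main(String[] args) {
        int partitions = 4001;
        System.out.println(impliedDataMb(partitions, 50));   // 200050 MB, about 200 GB
        System.out.println(impliedDataMb(partitions, 100));  // 400100 MB, about 400 GB
        // Hypothetical 20 GB data set: roughly 5 MB per partition, far below 50 MB.
        System.out.println(averagePartitionMb(20_000, partitions));
    }
}
```

If your actual data volume gives an average well under 50 MB, a smaller partition count is probably worth considering.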
Since you have 8 cores on each VM, setting the operation thread count to 50 doesn't increase performance much, because you only have 16 * 8 = 128 CPU cores in the cluster. Unless you add more CPU, just increasing the thread count doesn't increase performance, at least beyond a certain point. So you should add more nodes to the cluster or increase the CPU count of each VM. That will have a much larger effect on performance.
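A quick sketch of the numbers involved, using the figures from the question (16 nodes, 8 cores and 50 operation threads per node, 4001 partitions). The class and method names are illustrative only.

```java
public class ClusterThreadMath {
    static int totalCores(int nodes, int coresPerNode) {
        return nodes * coresPerNode;
    }

    static int totalOperationThreads(int nodes, int threadsPerNode) {
        return nodes * threadsPerNode;
    }

    // How many operation threads compete for each core on a single node.
    static double threadsPerCore(int threadsPerNode, int coresPerNode) {
        return (double) threadsPerNode / coresPerNode;
    }

    // Cluster-wide ratio of partitions to operation threads.
    static double partitionsPerThread(int partitions, int totalThreads) {
        return (double) partitions / totalThreads;
    }

    public static void main(String[] args) {
        int nodes = 16, coresPerNode = 8, threadsPerNode = 50, partitions = 4001;
        int threads = totalOperationThreads(nodes, threadsPerNode);
        System.out.println(totalCores(nodes, coresPerNode));               // 128
        System.out.println(threads);                                       // 800
        System.out.println(threadsPerCore(threadsPerNode, coresPerNode));  // 6.25
        System.out.println(partitionsPerThread(partitions, threads));      // about 5
    }
}
```

With more than six operation threads per core, the extra threads mostly wait for CPU time. Lowering the partition-to-thread ratio by adding threads therefore changes little, while adding cores or nodes raises the number of operations that can actually run in parallel.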