
JGroups protocols for synchronous RPC calls


Our cluster of ~50 members uses JGroups for synchronous one-to-one (unicast) RPC calls.

Tens of thousands of calls are made per day, each completing in anywhere from 10 ms to 1 hour. Request and response sizes range from 0 to 100 MB.

Our hosts are scattered in different data centres, hence TCP is used.

There are no multicast messages, only synchronous RPC calls.

Which protocols from conf/tcp.xml should be used with the latest JGroups version? Is there anything better than TCP, such as TCP_NIO?


Solution

  • which complete in the range of 10 ms to 1 hour.

    If an RPC can take up to 1 hour, I don't think synchronous RPCs are the way to go; I'd suggest switching to asynchronous ones. Alternatively, you could invoke RPCs that return a CompletableFuture [1], whose callback is invoked once the call has completed. This has the advantage that you're not blocking a thread from the pool while waiting.
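
    To make the non-blocking style concrete, here is a minimal sketch that uses only the JDK. slowRemoteCall and invokeAsync are hypothetical stand-ins for a long-running RPC, not JGroups APIs. In JGroups 4.x the matching entry point should be RpcDispatcher.callRemoteMethodWithFuture(), but check the manual [1] for the exact signature in your version.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class AsyncRpcSketch {
    // Hypothetical stand-in for a slow remote call (not a JGroups API).
    static String slowRemoteCall(String request, long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return "reply to " + request;
    }

    // Returns immediately; the future completes when the simulated call finishes.
    static CompletableFuture<String> invokeAsync(String request, long millis,
                                                 ExecutorService workers) {
        return CompletableFuture.supplyAsync(() -> slowRemoteCall(request, millis), workers);
    }

    // Chains a callback onto the future instead of blocking while the call runs.
    static String demo() {
        ExecutorService workers = Executors.newFixedThreadPool(2);
        try {
            CompletableFuture<String> result =
                invokeAsync("job-1", 200, workers).thenApply(String::toUpperCase);
            // The calling thread is free to do other work here.
            return result.join(); // joined only so the demo can return a value
        } finally {
            workers.shutdown();
        }
    }
}
```

    In real code you would normally end the chain with a callback (thenAccept, whenComplete) rather than join(), so that no thread waits for the full hour.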

    I'd start out with tcp.xml and make changes as needed. E.g. increase the thread pool's max_threads attribute to 50, to accommodate all 50 members sending at the same time.

    Also, think about whether to use regular or OOB RPCs: unless you need ordering, OOB RPCs can be delivered in parallel, whereas regular RPCs (from the same sender) are delivered one by one.
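
    The difference in delivery can be illustrated with a small JDK-only simulation. This is not JGroups code: a single-thread executor stands in for ordered delivery of regular messages from one sender, and a fixed pool stands in for OOB delivery. The class and method names are made up for the example. In JGroups itself, the choice is made by setting Message.Flag.OOB in the RequestOptions of the call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

class DeliverySketch {
    // Runs `count` simulated handlers on the executor and returns the highest
    // number of handlers that were running at the same time.
    static int maxConcurrentHandlers(ExecutorService executor, int count, long handlerMillis) {
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        List<CompletableFuture<Void>> futures = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            futures.add(CompletableFuture.runAsync(() -> {
                int now = running.incrementAndGet();
                peak.accumulateAndGet(now, Math::max);
                try {
                    Thread.sleep(handlerMillis); // simulated handler work
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    running.decrementAndGet();
                }
            }, executor));
        }
        futures.forEach(CompletableFuture::join);
        executor.shutdown();
        return peak.get();
    }

    // Regular messages from one sender: handlers run one after another.
    static int regularStyle(int count, long handlerMillis) {
        return maxConcurrentHandlers(Executors.newSingleThreadExecutor(), count, handlerMillis);
    }

    // OOB messages: handlers may overlap, up to the pool size.
    static int oobStyle(int count, long handlerMillis, int poolSize) {
        return maxConcurrentHandlers(Executors.newFixedThreadPool(poolSize), count, handlerMillis);
    }
}
```

    With regular-style delivery a single slow handler holds up every later call from the same sender, which matters a lot when one call can run for an hour.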

    If you don't need state transfer, remove BARRIER and STATE_TRANSFER.

    I suggest writing a simple perf test (or using UPerf) and measuring whether the performance is suitable. I'd also use probe.sh to take a look at the stats.

    You can tune so many things in JGroups that it would take too long to list them all here...
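
    A simple perf test can be as small as the sketch below, which times repeated invocations and reports nearest-rank percentiles. It is a rough starting point, not a replacement for UPerf: the Runnable is a placeholder for your own blocking RPC invocation, and the class and method names are made up for the example.

```java
import java.util.Arrays;

class RpcLatencySketch {
    // Invokes `rpc` the given number of times and returns the latencies
    // in nanoseconds, sorted in ascending order.
    static long[] measure(Runnable rpc, int iterations) {
        long[] nanos = new long[iterations];
        for (int i = 0; i < iterations; i++) {
            long start = System.nanoTime();
            rpc.run();
            nanos[i] = System.nanoTime() - start;
        }
        Arrays.sort(nanos);
        return nanos;
    }

    // Nearest-rank percentile over an ascending array; p must be in (0, 100].
    static long percentile(long[] sorted, double p) {
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(0, rank - 1)];
    }
}
```

    Run it with a representative mix of payload sizes and look at the high percentiles (e.g. the 99th) as well as the median, since large payloads and slow handlers tend to show up in the tail.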

    [1] http://www.jgroups.org/manual4/index.html#RpcDispatcher