How does the server communicate back to the client in Raft?

According to the Raft paper, every server (the followers as well as the leader) has its own log and its own state machine, and each state machine processes the same sequence of commands from its log.

I have a few questions about this scenario.

[1] If a client sends a request to the leader, does that mean all the follower servers also process the request and produce the output? If so, who sends the output back to the client?

[2] If the answer to the first question is that only the leader sends the output back to the client, then what is the use of multiple followers processing the same inputs from their logs in their state machines? Raft already ensures that all logs contain the same commands in the same order. Wouldn't it be sufficient for just the leader to apply the entry to its state machine and return the result to the client?

[3] Also, if there are multiple clients making the same request, is it only the leader that sends the output to all of them, or do the followers come into the picture here?

Solution

The answer to your first question is indeed that the leader’s state machine output is returned to the client.
Technically, with the basic Raft protocol there’s no reason followers have to immediately apply entries to their state machines. Indeed, followers typically don’t even learn of an entry’s commitment until after the leader has already responded to the client. The primary reason for followers to apply commands to their state machines is simply to keep up with the leader. If the leader crashes, a follower will be elected leader and will need to take over servicing client requests. Once elected, the new leader will have to apply all unapplied commands to its state machine before it can begin servicing client requests. Applying commands on followers as they’re committed reduces the cost of leader changes, and the cost of applying commands on followers is low anyways since they’re not serving client requests.
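To make the catch-up cost concrete, here is a minimal sketch in Python. It is not taken from the Raft paper or any real implementation: the Node class, its field names, and the key-value state machine are all assumptions made for illustration, and replication and majority acknowledgement are elided entirely.

```python
class Node:
    """Hypothetical Raft server reduced to a log plus a key-value state machine."""

    def __init__(self):
        self.log = []          # replicated commands, here (key, value) pairs
        self.commit_index = 0  # number of log entries known to be committed
        self.last_applied = 0  # number of log entries applied to the state machine
        self.state = {}        # the state machine itself

    def apply_committed(self):
        """Apply every committed-but-unapplied entry and return the outputs."""
        outputs = []
        while self.last_applied < self.commit_index:
            key, value = self.log[self.last_applied]
            self.state[key] = value
            self.last_applied += 1
            outputs.append((self.last_applied, value))
        return outputs


leader, follower = Node(), Node()
for command in [("x", 1), ("y", 2), ("x", 3)]:
    leader.log.append(command)
    follower.log.append(command)      # replication (AppendEntries) elided
    leader.commit_index += 1          # majority acknowledgement elided
    reply = leader.apply_committed()  # only the leader replies to the client

# The follower learns the commit index later, for example from a heartbeat.
follower.commit_index = leader.commit_index
backlog = follower.commit_index - follower.last_applied
follower.apply_committed()
```

In this sketch the follower's backlog is the work a newly elected leader would have to finish before serving clients. A follower that calls apply_committed every time its commit index advances keeps that backlog near zero.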
There’s another reason to apply commands on followers, and your third question comes close to uncovering it. Only the leader ever responds to client write requests, but followers can respond to read requests with relaxed consistency guarantees (sequential consistency). In order to do this, the leader returns the write index for completed commands along with the output. The client can then query a follower, and once the follower’s state machine has reached at least the index of the client’s last write (supplied by the client), the follower can query the state machine and return the output. This allows clients to spread queries across the leader and followers, and it’s probably the best reason practical systems ensure followers’ state machines attempt to keep up with the leader’s state.
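The follower-read scheme described above can be sketched roughly as follows. This is an illustration only: the Replica class, its method names, and the (ok, value) return convention are assumptions, and a real system might block or redirect the client instead of refusing the read.

```python
class Replica:
    """Hypothetical server holding a key-value state machine."""

    def __init__(self):
        self.state = {}
        self.last_applied = 0  # index of the last command applied

    def apply(self, key, value):
        """Apply one committed command and return its index."""
        self.state[key] = value
        self.last_applied += 1
        return self.last_applied

    def read(self, key, min_index):
        """Serve a read only if this replica has applied up to min_index."""
        if self.last_applied < min_index:
            return (False, None)  # not caught up to the client's last write
        return (True, self.state.get(key))


leader, follower = Replica(), Replica()

# The leader applies the write and returns its index along with the output.
write_index = leader.apply("x", 1)

# The client supplies that index when querying a follower.
stale = follower.read("x", write_index)  # follower has not applied it yet
follower.apply("x", 1)                   # follower catches up
fresh = follower.read("x", write_index)  # now safe to answer
```

Because the client carries the index of its own last write, it never observes state older than what it has already written, even when it reads from a follower that lags the leader.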