I was trying to configure a client-side retry policy for some gRPC services, but it is not behaving the way I expect. Either I am misunderstanding how retry policies work in gRPC, or there is a mistake in the policy. Here's the policy:
var retryPolicy = `{
"methodConfig": [{
"name": [{"service": "serviceA"}, {"service":"serviceB"}],
"timeout":"30.0s",
"waitForReady": true,
"retryPolicy": {
"MaxAttempts": 10,
"InitialBackoff": ".5s",
"MaxBackoff": "10s",
"BackoffMultiplier": 1.5,
"RetryableStatusCodes": [ "UNAVAILABLE", "UNKNOWN" ]
}
}]
}`
What I expected was this: if the client's gRPC request to a method defined in one of the services (serviceA or serviceB) fails due to a transient error, the call is retried, and since waitForReady is true, the client blocks the call until a connection is available (or the call is canceled or times out). But when I purposely take down the server that the request is going to, the client gets an Unavailable gRPC status code, and the error is: Error while dialing dial tcp xx.xx.xx.xx:xxxx: i/o timeout
However, the client did not get this error 30 seconds later; it received it right away. Could the reason be how I'm giving the service names? Does it need the path of the file where the service is defined? For a bit more context, the gRPC service is defined in another package, which the client imports. Any help would be greatly appreciated.
Looking through the documentation, I came across this link: https://github.com/grpc/grpc-proto/blob/master/grpc/service_config/service_config.proto and on line 72 it mentions
message Name {
string service = 1; // Required. Includes proto package name.
string method = 2;
}
I wasn't adding the proto package name when listing the services. So the retry policy should be:
var retryPolicy = `{
"methodConfig": [{
"name": [{"service": "pkgA.serviceA"}, {"service":"pkgB.serviceB"}],
"timeout":"30.0s",
"waitForReady": true,
"retryPolicy": {
"MaxAttempts": 10,
"InitialBackoff": ".5s",
"MaxBackoff": "10s",
"BackoffMultiplier": 1.5,
"RetryableStatusCodes": [ "UNAVAILABLE", "UNKNOWN" ]
}
}]
}`
where pkgA and pkgB are the proto package names.
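
To see why the unqualified names never matched, here is a minimal sketch of the lookup, using only the standard library. It is not grpc-go's actual implementation: it assumes a call is identified by a full method name of the form /package.Service/Method, compares the service part against each "name" entry, and ignores the wildcard entry (empty service) that a real service config also allows. The method name DoWork and the helper configApplies are hypothetical, introduced only for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// serviceConfig mirrors only the "name" entries of a gRPC service config.
type serviceConfig struct {
	MethodConfig []struct {
		Name []struct {
			Service string `json:"service"`
			Method  string `json:"method"`
		} `json:"name"`
	} `json:"methodConfig"`
}

// configApplies reports whether any name entry in cfgJSON selects fullMethod,
// which must look like "/pkg.Service/Method". An entry with an empty method
// is treated as covering every method of that service.
func configApplies(cfgJSON, fullMethod string) (bool, error) {
	var cfg serviceConfig
	if err := json.Unmarshal([]byte(cfgJSON), &cfg); err != nil {
		return false, err
	}
	parts := strings.SplitN(strings.TrimPrefix(fullMethod, "/"), "/", 2)
	if len(parts) != 2 {
		return false, fmt.Errorf("malformed method name %q", fullMethod)
	}
	service, method := parts[0], parts[1]
	for _, mc := range cfg.MethodConfig {
		for _, n := range mc.Name {
			if n.Service == service && (n.Method == "" || n.Method == method) {
				return true, nil
			}
		}
	}
	return false, nil
}

func main() {
	// Hypothetical method on serviceA, which lives in proto package pkgA.
	const fullMethod = "/pkgA.serviceA/DoWork"

	bare := `{"methodConfig":[{"name":[{"service":"serviceA"}]}]}`
	qualified := `{"methodConfig":[{"name":[{"service":"pkgA.serviceA"}]}]}`

	for _, cfg := range []string{bare, qualified} {
		ok, err := configApplies(cfg, fullMethod)
		fmt.Println(ok, err)
	}
	// Prints:
	// false <nil>
	// true <nil>
}
```

With the bare name, no method config is selected, so the call runs without the timeout, waitForReady, or retryPolicy settings, which is consistent with the error coming back immediately.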