grpc grpc-java

gRPC client failing with "CANCELLED: io.grpc.Context was cancelled without error"


I have a gRPC server written in C++ and a client written in Java. Everything worked fine with a blocking stub. I then decided to make one of the calls asynchronous, so I created an additional stub in my client using newStub(channel) instead of newBlockingStub(channel). I made no changes on the server side. The call is a simple unary RPC.

So I changed

Empty response = blockingStub.callMethod(request);

to

asyncStub.callMethod(request, new StreamObserver<Empty>() {
    @Override
    public void onNext(Empty response) {
       logInfo("asyncStub.callMethod.onNext");
    }

    @Override
    public void onError(Throwable throwable) {
       logError("asyncStub.callMethod.onError " + throwable.getMessage());
    }

    @Override
    public void onCompleted() {
        logInfo("asyncStub.callMethod.onCompleted");
    }
});

Ever since then, onError is called most of the time when I use this RPC, and the error it gives is "CANCELLED: io.grpc.Context was cancelled without error". I read about forking Context objects when making an RPC from within another RPC, but that's not the case here. Also, Context seems to be a server-side object, so I don't see how it relates to the client. Is this a server-side error propagating back to the client? On the server side everything seems to complete successfully, so I'm at a loss as to why this is happening. Inserting a 1 ms sleep after calling asyncStub.callMethod seems to make the issue go away, but that defeats the purpose. Any help in understanding this would be greatly appreciated.

Some notes:

  1. The processing time on the server side is around 1 microsecond.
  2. Until now, the round-trip time for the blocking call was several hundred microseconds. This is the time I'm trying to cut down: the method is essentially a void function, so I don't need to wait for a response.
  3. This method is called multiple times in a row. Previously each call waited until the previous one finished; now they fire off one after the other.
  4. Some snippets from the proto file:
service EventHandler {
  rpc callMethod(Msg) returns (Empty) {}
}

message Msg {
  uint64 fieldA = 1;
  int32 fieldB = 2;
  string fieldC = 3;
  string fieldD = 4;
}

message Empty {

}

Solution

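  • It turns out I was wrong: the Context object is used by the client too. A client call is tied to the Context that is current when the call starts, so if that Context is cancelled before the response arrives, the call fails with CANCELLED. The solution was to start the call from a forked Context, which keeps the current Context's values but does not inherit its cancellation: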

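    // Requires: import io.grpc.Context;
    // fork() keeps the current Context's values but not its cancellation.
    Context newContext = Context.current().fork();
    Context origContext = newContext.attach();
    try {
        // Start the async RPC here, e.g. asyncStub.callMethod(request, observer)
    } finally {
        // Restore the previous Context on this thread.
        newContext.detach(origContext);
    }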
    

    Hopefully this can help someone else in the future.
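
    To illustrate why forking helps, here is a minimal toy model in plain Java. It does not use gRPC at all: ToyContext and ToyCall are hypothetical stand-ins invented for this sketch, not io.grpc classes, and the model is single-threaded for simplicity. It only assumes the behavior described above: a call watches the context it was started in, and a forked context is not cancelled when the original one is.

```java
import java.util.ArrayList;
import java.util.List;

public class ForkDemo {

    /** Toy stand-in for a cancellable context. NOT io.grpc.Context. */
    static final class ToyContext {
        private final List<Runnable> listeners = new ArrayList<>();
        private boolean cancelled;

        /** Returns a new context that is unaffected by cancelling this one. */
        ToyContext fork() {
            return new ToyContext();
        }

        /** Registers a callback to run when this context is cancelled. */
        void addListener(Runnable listener) {
            if (cancelled) {
                listener.run();
            } else {
                listeners.add(listener);
            }
        }

        /** Cancels this context and notifies everything watching it. */
        void cancel() {
            if (cancelled) {
                return;
            }
            cancelled = true;
            for (Runnable listener : listeners) {
                listener.run();
            }
        }
    }

    /** Toy in-flight call: fails if its context is cancelled before it completes. */
    static final class ToyCall {
        private String outcome = "PENDING";

        ToyCall(ToyContext context) {
            context.addListener(() -> {
                if (outcome.equals("PENDING")) {
                    outcome = "CANCELLED";
                }
            });
        }

        /** Simulates the response arriving. */
        void complete() {
            if (outcome.equals("PENDING")) {
                outcome = "OK";
            }
        }

        String outcome() {
            return outcome;
        }
    }

    /** Starts a call, cancels the caller's context, then delivers the response. */
    static String run(boolean fork) {
        ToyContext caller = new ToyContext();
        ToyCall call = new ToyCall(fork ? caller.fork() : caller);
        caller.cancel();  // the caller's context ends before the response arrives
        call.complete();  // the response arrives afterwards
        return call.outcome();
    }

    public static void main(String[] args) {
        System.out.println("without fork: " + run(false)); // prints "without fork: CANCELLED"
        System.out.println("with fork: " + run(true));     // prints "with fork: OK"
    }
}
```

    In this model the 1 ms sleep from the question would correspond to calling complete() before cancel(), which is why it appeared to hide the problem without addressing the cause.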