I'm trying to set up a gRPC client in Python to hit a particular server. The server is set up to require authentication via an access token. Therefore, my implementation looks like this:
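from grpc import (access_token_call_credentials, composite_channel_credentials,
                  secure_channel, ssl_channel_credentials)

def create_connection(target, access_token):
    # Combine TLS channel credentials with per-call access-token credentials.
    credentials = composite_channel_credentials(
        ssl_channel_credentials(),
        access_token_call_credentials(access_token))
    target = target if target else DEFAULT_ENDPOINT
    return secure_channel(target=target, credentials=credentials)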
conn = create_connection(svc="myservice", session=Session(client_id=id, client_secret=secret))
stub = FakeStub(conn)
stub.CreateObject(CreateObjectRequest())
The issue I'm having is that when I attempt to use this connection, I get the following error:
File "<stdin>", line 1, in <module>
File "\anaconda3\envs\test\lib\site-packages\grpc\_interceptor.py", line 216, in __call__
response, ignored_call = self._with_call(request,
File "\anaconda3\envs\test\lib\site-packages\grpc\_interceptor.py", line 257, in _with_call
return call.result(), call
File "anaconda3\envs\test\lib\site-packages\grpc\_channel.py", line 343, in result
raise self
File "\anaconda3\envs\test\lib\site-packages\grpc\_interceptor.py", line 241, in continuation
response, call = self._thunk(new_method).with_call(
File "\anaconda3\envs\test\lib\site-packages\grpc\_interceptor.py", line 266, in with_call
return self._with_call(request,
File "\anaconda3\envs\test\lib\site-packages\grpc\_interceptor.py", line 257, in _with_call
return call.result(), call
File "\anaconda3\envs\test\lib\site-packages\grpc\_channel.py", line 343, in result
raise self
File "\anaconda3\envs\test\lib\site-packages\grpc\_interceptor.py", line 241, in continuation
response, call = self._thunk(new_method).with_call(
File "\anaconda3\envs\test\lib\site-packages\grpc\_channel.py", line 957, in with_call
return _end_unary_response_blocking(state, call, True, None)
File "\anaconda3\envs\test\lib\site-packages\grpc\_channel.py", line 849, in _end_unary_response_blocking
raise _InactiveRpcError(state)
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
status = StatusCode.UNAVAILABLE
details = "failed to connect to all addresses"
debug_error_string = "{
"created":"@1633399048.828000000",
"description":"Failed to pick subchannel",
"file":"src/core/ext/filters/client_channel/client_channel.cc",
"file_line":3159,
"referenced_errors":[
{
"created":"@1633399048.828000000",
"description":
"failed to connect to all addresses",
"file":"src/core/lib/transport/error_utils.cc",
"file_line":147,
"grpc_status":14
}
]
}"
I looked up the status code associated with this response and it seems that the server is unavailable. So, I tried waiting for the connection to be ready:
channel_ready_future(conn).result()
but this hangs. What am I doing wrong here?
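A hang like this does not say whether the host is unreachable or whether the connection stalls later, during the TLS or HTTP/2 setup. As a rough first check, a plain TCP connect with a timeout can separate the two cases. The sketch below uses only the standard library; the helper name tcp_reachable is made up for illustration and is not part of grpc.

```python
import socket


def tcp_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a plain TCP connection to (host, port) succeeds."""
    try:
        # create_connection resolves the host and tries each address in turn.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False
```

If tcp_reachable(host, port) returns True for the target while channel_ready_future still hangs, the problem is more likely in the TLS handshake or protocol negotiation than in basic connectivity.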
UPDATE 1
I converted the code to use the async connection instead of the synchronous connection but the issue still persists. Also, I saw that this question had also been posted on SO but none of the solutions presented there fixed the problem I'm having.
UPDATE 2
I assumed that this issue was occurring because the client couldn't find the TLS certificate issued by the server, so I added the following code:
import ssl

def _get_cert(target: str) -> bytes:
    # Split "host:port" and fetch the server's certificate in PEM form.
    host, port = target.rsplit(":", 1)
    pem = ssl.get_server_certificate((host, int(port)))
    return pem.encode()
and then changed ssl_channel_credentials() to ssl_channel_credentials(_get_cert(target)). However, this also hasn't fixed the problem.
The issue here was actually fairly deep. First, I turned on tracing and set the gRPC log level to debug, and then found this line:
D1006 12:01:33.694000000 9032 src/core/lib/security/transport/security_handshaker.cc:182] Security handshake failed: {"created":"@1633489293.693000000","description":"Cannot check peer: missing selected ALPN property.","file":"src/core/lib/security/security_connector/ssl_utils.cc","file_line":160}
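For anyone wanting to reproduce this kind of output: gRPC's C core reads the GRPC_VERBOSITY and GRPC_TRACE environment variables, so they have to be set before grpc is imported. A minimal sketch follows; the helper name and the tracer selection are assumptions chosen for illustration, not necessarily the exact settings that produced the line above.

```python
import os


def enable_grpc_debug_logging(tracers=("handshaker", "tsi")):
    """Set the environment variables that gRPC's C core reads for logging."""
    os.environ["GRPC_VERBOSITY"] = "DEBUG"
    # Assumed tracer selection; "all" is also accepted but is very noisy.
    os.environ["GRPC_TRACE"] = ",".join(tracers)


# Call this before `import grpc` so the settings are seen when the core initializes.
enable_grpc_debug_logging()
```

The same variables can also be set in the shell before starting the Python process, which avoids any ordering concerns around the import.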
That ALPN error led me to this GitHub issue, which stated that the issue was with grpcio not inserting the h2 protocol into requests, which would cause ALPN-enabled servers to return that specific error. Some further digging led me to this issue, and since the server I connected to also uses Envoy, it was just a matter of modifying the Envoy deployment file to add alpn_protocols to the cluster's TLS context:
clusters:
- name: my-server
  connect_timeout: 10s
  type: strict_dns
  lb_policy: round_robin
  http2_protocol_options: {}
  hosts:
  - socket_address:
      address: python-server
      port_value: 1337
  tls_context:
    common_tls_context:
      tls_certificates:
      alpn_protocols: ["h2"]  # <====== Add this.
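To confirm that a change like this took effect, one option is to check which ALPN protocol the endpoint negotiates, independently of gRPC. The following is a rough sketch using Python's standard ssl module; the function name negotiated_alpn is hypothetical, and certificate verification is deliberately disabled because only the ALPN result is of interest here.

```python
import socket
import ssl
from typing import Optional


def negotiated_alpn(host: str, port: int, timeout: float = 5.0) -> Optional[str]:
    """Offer "h2" via ALPN and return the protocol the server selects, or None."""
    ctx = ssl.create_default_context()
    # Diagnostic only: skip certificate checks so self-signed servers can be probed.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    ctx.set_alpn_protocols(["h2"])
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.selected_alpn_protocol()
```

A return value of "h2" suggests the endpoint advertises HTTP/2 through ALPN. A value of None would be consistent with the "missing selected ALPN property" failure in the log above.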