
Envoy proxy with gRPC server streaming throws UNAVAILABLE: upstream request timeout


We have a gRPC client and a gRPC server that supports server-side streaming:

rpc LotsOfReplies(HelloRequest) returns (stream HelloResponse);

The gRPC server runs behind an Envoy proxy configured for gRPC.

When the gRPC client connects through Envoy (client -> Envoy -> gRPC server), the call fails with the exception below. The same code works fine when the client connects directly to the gRPC server, without Envoy.

io.grpc.StatusRuntimeException: UNAVAILABLE: upstream request timeout
    at io.grpc.Status.asRuntimeException(Status.java:535)
    at io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onClose(ClientCalls.java:478)
    at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:553)
    at io.grpc.internal.ClientCallImpl.access$300(ClientCallImpl.java:68)
    at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInternal(ClientCallImpl.java:739)
    at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:718)
    at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
    at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

Our envoy.yaml is included below for reference. Any help would be appreciated.

static_resources:

  listeners:
    - name: listener_0
      address:
        socket_address:
          address: X.X.X.X
          port_value: 443
          ipv4Compat: true
      filter_chains:
        - filter_chain_match: {}
          transport_socket:
            name: envoy.transport_sockets.tls
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.DownstreamTlsContext
              common_tls_context:
                tls_params:
                  cipher_suites:
                    - ECDHE-ECDSA-AES128-GCM-SHA256
                    - ECDHE-RSA-AES128-GCM-SHA256
                    - ECDHE-ECDSA-AES128-SHA
                    - ECDHE-RSA-AES128-SHA
                    - AES128-GCM-SHA256
                    - AES128-SHA
                    - ECDHE-ECDSA-AES256-GCM-SHA384
                    - ECDHE-RSA-AES256-GCM-SHA384
                    - ECDHE-ECDSA-AES256-SHA
                    - ECDHE-RSA-AES256-SHA
                    - AES256-GCM-SHA384
                    - AES256-SHA
                  ecdh_curves:
                    - P-256
                tls_certificates:
                  - certificate_chain:
                      filename: "/home/.tomcat_cert.pem"
                    private_key:
                      filename: "/home/.tomcat_key.pem"
                validation_context:
                  trust_chain_verification: ACCEPT_UNTRUSTED
                alpn_protocols:
                  - h2
              require_client_certificate: false
          filters:
            - name: envoy.filters.network.http_connection_manager
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
                stat_prefix: ingress_http
                http_filters:
                  - name: envoy.filters.http.router
                route_config:
                  name: local_route
                  virtual_hosts:
                    - name: local_service
                      domains: ["*"]
                      routes:
                        - match:
                            prefix: "/api.ApiService"
                          route:
                            cluster: grpc-server
                            idle_timeout: 0s
                            max_stream_duration:
                              grpc_timeout_header_max: 35s
                        - match:
                            prefix: "/site"
                          route:
                            cluster: site_router
  clusters:
    - name: site_router
      type: static
      # Comment out the following line to test on v6 networks
      lb_policy: round_robin
      connect_timeout: 25s
      http2_protocol_options: {}
      load_assignment:
        cluster_name: site_router
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: 127.0.0.1
                      port_value: 7880
    - name: grpc-server
      type: static
      # Comment out the following line to test on v6 networks
      lb_policy: round_robin
      connect_timeout: 25s
      http2_protocol_options: {}
      load_assignment:
        cluster_name: grpc-server
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: 127.0.0.1
                      port_value: 7879

Note that this configuration works fine for unary gRPC calls such as:

rpc SayHello(HelloRequest) returns (HelloResponse);

but not for server-streaming calls:

rpc LotsOfReplies(HelloRequest) returns (stream HelloResponse);


Solution

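  • Setting timeout: 0s on the gRPC route in envoy.yaml solved my problem. The configuration below also sets stream_idle_timeout: 0s on the HTTP connection manager.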

    static_resources:
    
      listeners:
        - name: listener_0
          address:
            socket_address:
              address: X.X.X.X
              port_value: 443
              ipv4Compat: true
          filter_chains:
            - filter_chain_match: {}
              transport_socket:
                name: envoy.transport_sockets.tls
                typed_config:
                  "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.DownstreamTlsContext
                  common_tls_context:
                    tls_params:
                      cipher_suites:
                        - ECDHE-ECDSA-AES128-GCM-SHA256
                        - ECDHE-RSA-AES128-GCM-SHA256
                        - ECDHE-ECDSA-AES128-SHA
                        - ECDHE-RSA-AES128-SHA
                        - AES128-GCM-SHA256
                        - AES128-SHA
                        - ECDHE-ECDSA-AES256-GCM-SHA384
                        - ECDHE-RSA-AES256-GCM-SHA384
                        - ECDHE-ECDSA-AES256-SHA
                        - ECDHE-RSA-AES256-SHA
                        - AES256-GCM-SHA384
                        - AES256-SHA
                      ecdh_curves:
                        - P-256
                    tls_certificates:
                      - certificate_chain:
                          filename: "/home/.tomcat_cert.pem"
                        private_key:
                          filename: "/home/.tomcat_key.pem"
                    validation_context:
                      trust_chain_verification: ACCEPT_UNTRUSTED
                    alpn_protocols:
                      - h2
                  require_client_certificate: false
              filters:
                - name: envoy.filters.network.http_connection_manager
                  typed_config:
                    "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
                    stat_prefix: ingress_http
                    stream_idle_timeout: 0s
                    http_filters:
                      - name: envoy.filters.http.router
                    route_config:
                      name: local_route
                      virtual_hosts:
                        - name: local_service
                          domains: ["*"]
                          routes:
                            - match:
                                prefix: "/api.ApiService"
                              route:
                                cluster: grpc-server
                                timeout: 0s
                            - match:
                                prefix: "/policy"
                              route:
                                cluster: site_router
                                timeout: 30s
      clusters:
        - name: site_router
          type: static
          # Comment out the following line to test on v6 networks
          lb_policy: round_robin
          connect_timeout: 25s
          http2_protocol_options: {}
          load_assignment:
            cluster_name: site_router
            endpoints:
              - lb_endpoints:
                  - endpoint:
                      address:
                        socket_address:
                          address: 127.0.0.1
                          port_value: 7880
        - name: grpc-server
          type: static
          # Comment out the following line to test on v6 networks
          lb_policy: round_robin
          connect_timeout: 25s
          http2_protocol_options: {}
          load_assignment:
            cluster_name: grpc-server
            endpoints:
              - lb_endpoints:
                  - endpoint:
                      address:
                        socket_address:
                          address: 127.0.0.1
                          port_value: 7879
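
The timeout-related lines are the part of this configuration that matters for the fix. As background (this is my understanding of Envoy's defaults, not something stated in the original post): a route's timeout defaults to 15 seconds and covers the entire response, so a server stream that stays open longer than that is cut off with "upstream request timeout", while a short unary call finishes well within the limit. A trimmed sketch of just the relevant fragment, with everything else omitted:

```yaml
# Fragment only: these keys sit inside the http_connection_manager typed_config.
stream_idle_timeout: 0s          # disable the per-stream idle timeout (Envoy's default is 5 minutes)
route_config:
  name: local_route
  virtual_hosts:
    - name: local_service
      domains: ["*"]
      routes:
        - match:
            prefix: "/api.ApiService"
          route:
            cluster: grpc-server
            timeout: 0s          # disable the route timeout (default 15s) for long-lived streams
```

With both timeouts disabled, nothing on the Envoy side limits how long a stream may stay open. If an upper bound is still wanted, the route-level max_stream_duration setting (which the question used only with grpc_timeout_header_max) may be worth looking at; I have not verified that variant against this setup.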