Tags: redis, istio, google-cloud-memorystore

GCP Memorystore Redis Connection refused after enabling Istio


We have an existing GKE cluster (1.16.9-gke.6) with some services talking to a GCP Memorystore Redis instance. However, after enabling Istio (version 1.16.3) on those pods, we started seeing connection refused errors when connecting to the Redis instance. As we are just starting out with Istio, we are allowing all external traffic from services inside the mesh, using:

meshConfig:
  outboundTrafficPolicy:
    mode: ALLOW_ANY

With this, all outbound traffic goes to a PassthroughCluster, as expected and as observed in Kiali and the Istio proxy logs.

In addition, I can log in to my app container and the istio-proxy container and netcat successfully to the Redis instance.

I also tried adding a ServiceEntry for the Redis instance in our Helm chart templates:

apiVersion: networking.istio.io/v1beta1
kind: ServiceEntry
metadata:
  name: gcp-memorystore-redis
spec:
  # note, host field is ignored for tcp
  hosts:
    - gcp-memorystore-redis
  addresses:
    - {{ .Values.redis.cidr }}
  endpoints:
    - address: {{ .Values.redis.nodeAddress }}
  ports:
    - number: 6379
      name: tcp-redis
      protocol: TCP
  resolution: STATIC
  location: MESH_EXTERNAL
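
For reference, here is a sketch of what the rendered manifest might look like with hypothetical values substituted (10.0.0.0/29 and 10.0.0.3 are placeholders for the Memorystore reserved IP range and instance IP, not values from my environment). With a plain TCP ServiceEntry, Envoy matches outbound traffic on the destination IP against the `addresses` CIDR rather than on the host name:

```yaml
# Rendered ServiceEntry sketch; 10.0.0.0/29 and 10.0.0.3 are
# placeholder values for the Memorystore reserved range and host IP.
apiVersion: networking.istio.io/v1beta1
kind: ServiceEntry
metadata:
  name: gcp-memorystore-redis
spec:
  hosts:
    - gcp-memorystore-redis   # not used for matching plain TCP traffic
  addresses:
    - 10.0.0.0/29             # destination IPs matched by this entry
  endpoints:
    - address: 10.0.0.3
  ports:
    - number: 6379
      name: tcp-redis
      protocol: TCP
  resolution: STATIC
  location: MESH_EXTERNAL
```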

However, I continue to get connection refused errors. We use the Redisson Java client library; I'm not sure if that matters.

As a workaround, I am bypassing the Istio proxy by adding our CIDR range via global.proxy.includeIPRanges and setting enableProtocolSniffingForOutbound: false. This works for now, but I would really like this to be configured as a ServiceEntry, because I eventually want my traffic to be routed via an egress gateway.
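
For context, the workaround can be expressed in IstioOperator values roughly like this (a sketch; 10.8.0.0/14 and 10.12.0.0/20 are hypothetical pod and service CIDRs — substitute your cluster's ranges). Traffic to any IP outside the listed ranges, including Memorystore, then bypasses the sidecar entirely:

```yaml
# Sketch of the includeIPRanges workaround. The CIDRs below are
# placeholders for the cluster's pod and service ranges; only traffic
# to these ranges is redirected through the sidecar proxy.
meshConfig:
  enableProtocolSniffingForOutbound: false
values:
  global:
    proxy:
      includeIPRanges: "10.8.0.0/14,10.12.0.0/20"
```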

What is the correct format to specify the Redis URI / node addresses in our application with the ServiceEntry above? This doesn't seem to work: redis://gcp-memorystore-redis:6379

Appreciate any help!


Solution

  • For anyone else who comes here, I was able to resolve this issue by adding the hack suggested in https://github.com/istio/istio/issues/11130 before starting the app container, and similarly adding a preStop hook on the proxy container as mentioned in https://github.com/istio/istio/issues/7136:

    values:
      global:
        proxy:
          lifecycle:
            preStop:
              exec:
                command: [
                  "/bin/sh", "-c",
                  "while [ $(netstat -plunt | grep tcp | grep -v envoy | wc -l | xargs) -ne 0 ]; do printf 'Waiting for App Server to shutdown'; sleep 1; done; echo 'App server shutdown, shutting down proxy...'"
                ]
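
    The app-container side of the hack (from the first issue linked above) looks roughly like the sketch below: block the app's entrypoint until Envoy's admin port (localhost:15000 inside the pod) starts answering. Here /app/start.sh is a hypothetical entrypoint and the sketch assumes curl is available in the app image:

    ```yaml
    # Sketch: delay app startup until the Envoy sidecar is ready.
    # /app/start.sh is a placeholder for the real entrypoint;
    # localhost:15000 is Envoy's admin port inside the pod.
    command: ["/bin/sh", "-c"]
    args:
      - |
        until curl --silent --head localhost:15000; do
          echo "Waiting for sidecar..."
          sleep 3
        done
        echo "Sidecar available"
        exec /app/start.sh
    ```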
    

    In future, if/when this PR: https://github.com/istio/istio/pull/24737 gets merged, things will be slightly less hacky.