I am trying to implement a gRPC service on GKE (v1.11.2-gke.18) with mutual TLS auth.
When not enforcing client auth, the HTTP2 health check that GKE automatically creates responds, and everything connects without issue.
When I turn on mutual auth, the health check fails - presumably because it cannot complete a connection since it lacks a client certificate and key.
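For context, the server side of the setup described above looks roughly like the sketch below. It is not my production code: `mutualTLSConfig` and `throwawayCAPEM` are hypothetical names, and the CA is generated in memory purely so the example runs. The point is the `ClientAuth` policy, which rejects any peer that cannot present a certificate signed by the configured CA, and a health checker without credentials presumably falls into that category.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/tls"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/pem"
	"errors"
	"fmt"
	"math/big"
	"time"
)

// mutualTLSConfig returns a server-side TLS config that refuses any client
// unable to present a certificate signed by one of the CAs in caPEM.
func mutualTLSConfig(caPEM []byte) (*tls.Config, error) {
	pool := x509.NewCertPool()
	if !pool.AppendCertsFromPEM(caPEM) {
		return nil, errors.New("no CA certificates found in PEM input")
	}
	return &tls.Config{
		ClientCAs:  pool,
		ClientAuth: tls.RequireAndVerifyClientCert,
		MinVersion: tls.VersionTLS12,
	}, nil
}

// throwawayCAPEM is a hypothetical helper that creates a short-lived,
// self-signed CA so that the sketch is runnable without files on disk.
func throwawayCAPEM() ([]byte, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(1),
		Subject:               pkix.Name{CommonName: "demo-ca"},
		NotBefore:             time.Now().Add(-time.Hour),
		NotAfter:              time.Now().Add(time.Hour),
		IsCA:                  true,
		BasicConstraintsValid: true,
		KeyUsage:              x509.KeyUsageCertSign | x509.KeyUsageDigitalSignature,
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE", Bytes: der}), nil
}

func main() {
	caPEM, err := throwawayCAPEM()
	if err != nil {
		panic(err)
	}
	cfg, err := mutualTLSConfig(caPEM)
	if err != nil {
		panic(err)
	}
	fmt.Println("client auth policy:", cfg.ClientAuth)
}
```

In a real gRPC server this config would typically be wrapped in transport credentials and passed to the server constructor. That part is omitted here to keep the sketch limited to the standard library.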
As always, documentation is light and conflicting. I require a solution that is fully programmatic (i.e. no console tweaking), but I have not been able to find one, other than manually changing the health check to TCP.
From what I can see I am guessing that I will either need to:
service.alpha.kubernetes.io/app-protocols: '{"grpc":"HTTP2"}' proprietary annotation
Or perhaps there is something else that I have not considered? The config below works perfectly for REST and gRPC with TLS but breaks with mTLS.
service.yaml
apiVersion: v1
kind: Service
metadata:
  name: grpc-srv
  labels:
    type: grpc-srv
  annotations:
    service.alpha.kubernetes.io/app-protocols: '{"grpc":"HTTP2"}'
spec:
  type: NodePort
  ports:
  - name: grpc
    port: 9999
    protocol: TCP
    targetPort: 9999
  - name: http
    port: 8080
    protocol: TCP
    targetPort: 8080
  selector:
    app: myapp
ingress.yaml
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: io-ingress
  annotations:
    kubernetes.io/ingress.global-static-ip-name: "grpc-ingress"
    kubernetes.io/ingress.allow-http: "true"
spec:
  tls:
  - secretName: io-grpc
  - secretName: io-api
  rules:
  - host: grpc.xxx.com
    http:
      paths:
      - path: /*
        backend:
          serviceName: grpc-srv
          servicePort: 9999
  - host: rest.xxx.com
    http:
      paths:
      - path: /*
        backend:
          serviceName: grpc-srv
          servicePort: 8080
It seems that there is currently no way to achieve this using the GKE L7 ingress. However, I have been successful deploying an NGINX Ingress Controller. Google has a decent tutorial on how to deploy one.
This installs an L4 TCP load balancer with no health checks on the services, leaving NGINX to handle the L7 termination and routing. This gives you a lot more flexibility, but the devil is in the detail, and the detail isn't easy to come by. Most of what I found was learned by trawling through GitHub issues.
What I have managed to achieve is for NGINX to handle the TLS termination, and still pass through the certificate to the back end, so you can handle things such as user auth via the CN, or check the certificate serial against a CRL.
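As a rough illustration of what the back end can do once it has the certificate, here is a minimal sketch of CN-based user lookup combined with a serial check. The function `userFromCertificate`, the helper `sampleCert`, and the idea of holding revoked serials in a map keyed by their decimal string are all assumptions made for this example; a real deployment would populate that set from its CRL.

```go
package main

import (
	"crypto/x509"
	"crypto/x509/pkix"
	"errors"
	"fmt"
	"math/big"
)

// userFromCertificate returns the certificate's common name as the user ID,
// unless the serial number appears in the revoked set (decimal strings).
func userFromCertificate(cert *x509.Certificate, revoked map[string]bool) (string, error) {
	if cert == nil || cert.SerialNumber == nil {
		return "", errors.New("no certificate supplied")
	}
	serial := cert.SerialNumber.String()
	if revoked[serial] {
		return "", fmt.Errorf("certificate serial %s is revoked", serial)
	}
	if cert.Subject.CommonName == "" {
		return "", errors.New("certificate has no common name")
	}
	return cert.Subject.CommonName, nil
}

// sampleCert is a hypothetical helper that builds an in-memory certificate
// carrying only the fields this sketch inspects.
func sampleCert(cn string, serial int64) *x509.Certificate {
	return &x509.Certificate{
		Subject:      pkix.Name{CommonName: cn},
		SerialNumber: big.NewInt(serial),
	}
}

func main() {
	revoked := map[string]bool{"42": true}
	for _, cert := range []*x509.Certificate{
		sampleCert("alice", 7),
		sampleCert("mallory", 42),
	} {
		user, err := userFromCertificate(cert, revoked)
		fmt.Println(user, err)
	}
}
```

Note that this only makes sense when the certificate has already been verified against the CA chain, which is what NGINX does before forwarding it.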
Below is my ingress file. The annotations are the minimum required to achieve mTLS authentication, and still have access to the certificate in the back end.
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: grpc-ingress
  namespace: master
  annotations:
    kubernetes.io/ingress.class: nginx
    nginx.ingress.kubernetes.io/auth-tls-verify-client: "on"
    nginx.ingress.kubernetes.io/auth-tls-secret: "master/auth-tls-chain"
    nginx.ingress.kubernetes.io/auth-tls-verify-depth: "2"
    nginx.ingress.kubernetes.io/auth-tls-pass-certificate-to-upstream: "true"
    nginx.ingress.kubernetes.io/backend-protocol: "GRPCS"
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    nginx.ingress.kubernetes.io/grpc-backend: "true"
spec:
  tls:
  - hosts:
    - grpc.example.com
    secretName: auth-tls-chain
  rules:
  - host: grpc.example.com
    http:
      paths:
      - path: /grpc.AwesomeService
        backend:
          serviceName: awesome-srv
          servicePort: 9999
      - path: /grpc.FantasticService
        backend:
          serviceName: fantastic-srv
          servicePort: 9999
A few things to note:
- The auth-tls-chain secret contains 3 files. ca.crt is the certificate chain and should include any intermediate certificates. tls.crt contains your server certificate and tls.key contains your private key.
- backend-protocol: "GRPCS" is required so that NGINX connects to the back end over TLS (grpcs://) after handling the client handshake. If you want NGINX to terminate the TLS and run your services without encryption, use GRPC as the protocol.
- grpc-backend: "true" is required to let NGINX know to use HTTP2 for the backend requests.
The best part is that if you have multiple namespaces, or if you are running a REST service as well (e.g. gRPC Gateway), NGINX will reuse the same load balancer. This provides some savings over the GKE ingress, which would use a separate LB for each ingress.
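Returning to the secret layout noted above: since a malformed secret is an easy way to lose an afternoon, here is a hedged sketch of a pre-flight check for the three entries. `validateTLSSecret` and `selfSignedPair` are hypothetical names, and the self-signed pair stands in for real certificates only so the example runs on its own.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/tls"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/pem"
	"errors"
	"fmt"
	"math/big"
	"time"
)

// validateTLSSecret checks that ca.crt, tls.crt and tls.key are present,
// that tls.crt and tls.key form a key pair, and that ca.crt parses.
func validateTLSSecret(files map[string][]byte) error {
	for _, name := range []string{"ca.crt", "tls.crt", "tls.key"} {
		if len(files[name]) == 0 {
			return fmt.Errorf("secret is missing %s", name)
		}
	}
	if _, err := tls.X509KeyPair(files["tls.crt"], files["tls.key"]); err != nil {
		return fmt.Errorf("tls.crt and tls.key do not form a pair: %w", err)
	}
	if !x509.NewCertPool().AppendCertsFromPEM(files["ca.crt"]) {
		return errors.New("ca.crt contains no usable certificates")
	}
	return nil
}

// selfSignedPair is a hypothetical helper producing a throwaway
// certificate and private key in PEM form for demonstration.
func selfSignedPair() (certPEM, keyPEM []byte, err error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(1),
		Subject:               pkix.Name{CommonName: "demo"},
		NotBefore:             time.Now().Add(-time.Hour),
		NotAfter:              time.Now().Add(time.Hour),
		IsCA:                  true,
		BasicConstraintsValid: true,
		KeyUsage:              x509.KeyUsageCertSign | x509.KeyUsageDigitalSignature,
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, nil, err
	}
	keyDER, err := x509.MarshalECPrivateKey(key)
	if err != nil {
		return nil, nil, err
	}
	certPEM = pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE", Bytes: der})
	keyPEM = pem.EncodeToMemory(&pem.Block{Type: "EC PRIVATE KEY", Bytes: keyDER})
	return certPEM, keyPEM, nil
}

func main() {
	certPEM, keyPEM, err := selfSignedPair()
	if err != nil {
		panic(err)
	}
	files := map[string][]byte{"ca.crt": certPEM, "tls.crt": certPEM, "tls.key": keyPEM}
	fmt.Println("complete secret:", validateTLSSecret(files))
	delete(files, "ca.crt")
	fmt.Println("without ca.crt:", validateTLSSecret(files))
}
```

The check does not verify that tls.crt actually chains to ca.crt; it only catches missing or unparseable entries.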
The gRPC ingress above is from the master namespace; below is a REST ingress from the staging namespace.
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  namespace: staging
  annotations:
    kubernetes.io/ingress.class: nginx
    kubernetes.io/tls-acme: "true"
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
spec:
  tls:
  - hosts:
    - api-stage.example.com
    secretName: letsencrypt-staging
  rules:
  - host: api-stage.example.com
    http:
      paths:
      - path: /awesome
        backend:
          serviceName: awesome-srv
          servicePort: 8080
      - path: /fantastic
        backend:
          serviceName: fantastic-srv
          servicePort: 8080
For HTTP, I am using LetsEncrypt, but there's plenty of information available on how to set that up.
If you exec into the ingress-nginx pod, you will be able to see how NGINX has been configured:
...
server {
server_name grpc.example.com ;
listen 80;
set $proxy_upstream_name "-";
set $pass_access_scheme $scheme;
set $pass_server_port $server_port;
set $best_http_host $http_host;
set $pass_port $pass_server_port;
listen 442 proxy_protocol ssl http2;
# PEM sha: 142600b0866df5ed9b8a363294b5fd2490c8619d
ssl_certificate /etc/ingress-controller/ssl/default-fake-certificate.pem;
ssl_certificate_key /etc/ingress-controller/ssl/default-fake-certificate.pem;
ssl_certificate_by_lua_block {
certificate.call()
}
# PEM sha: 142600b0866df5ed9b8a363294b5fd2490c8619d
ssl_client_certificate /etc/ingress-controller/ssl/master-auth-tls-chain.pem;
ssl_verify_client on;
ssl_verify_depth 2;
error_page 495 496 = https://help.example.com/auth;
location /grpc.AwesomeService {
set $namespace "master";
set $ingress_name "grpc-ingress";
set $service_name "awesome-srv";
set $service_port "9999";
set $location_path "/grpc.AwesomeService";
rewrite_by_lua_block {
lua_ingress.rewrite({
force_ssl_redirect = true,
use_port_in_redirects = false,
})
balancer.rewrite()
plugins.run()
}
header_filter_by_lua_block {
plugins.run()
}
body_filter_by_lua_block {
}
log_by_lua_block {
balancer.log()
monitor.call()
plugins.run()
}
if ($scheme = https) {
more_set_headers "Strict-Transport-Security: max-age=15724800; includeSubDomains";
}
port_in_redirect off;
set $proxy_upstream_name "master-awesome-srv-9999";
set $proxy_host $proxy_upstream_name;
client_max_body_size 1m;
grpc_set_header Host $best_http_host;
# Pass the extracted client certificate to the backend
grpc_set_header ssl-client-cert $ssl_client_escaped_cert;
grpc_set_header ssl-client-verify $ssl_client_verify;
grpc_set_header ssl-client-subject-dn $ssl_client_s_dn;
grpc_set_header ssl-client-issuer-dn $ssl_client_i_dn;
# Allow websocket connections
grpc_set_header Upgrade $http_upgrade;
grpc_set_header Connection $connection_upgrade;
grpc_set_header X-Request-ID $req_id;
grpc_set_header X-Real-IP $the_real_ip;
grpc_set_header X-Forwarded-For $the_real_ip;
grpc_set_header X-Forwarded-Host $best_http_host;
grpc_set_header X-Forwarded-Port $pass_port;
grpc_set_header X-Forwarded-Proto $pass_access_scheme;
grpc_set_header X-Original-URI $request_uri;
grpc_set_header X-Scheme $pass_access_scheme;
# Pass the original X-Forwarded-For
grpc_set_header X-Original-Forwarded-For $http_x_forwarded_for;
# mitigate HTTPoxy Vulnerability
# https://www.nginx.com/blog/mitigating-the-httpoxy-vulnerability-with-nginx/
grpc_set_header Proxy "";
# Custom headers to proxied server
proxy_connect_timeout 5s;
proxy_send_timeout 60s;
proxy_read_timeout 60s;
proxy_buffering off;
proxy_buffer_size 4k;
proxy_buffers 4 4k;
proxy_request_buffering on;
proxy_http_version 1.1;
proxy_cookie_domain off;
proxy_cookie_path off;
# In case of errors try the next upstream server before returning an error
proxy_next_upstream error timeout;
proxy_next_upstream_tries 3;
grpc_pass grpcs://upstream_balancer;
proxy_redirect off;
}
location /grpc.FantasticService {
set $namespace "master";
set $ingress_name "grpc-ingress";
set $service_name "fantastic-srv";
set $service_port "9999";
set $location_path "/grpc.FantasticService";
...
This is just an extract of the generated nginx.conf. But you should be able to see how a single configuration could handle multiple services across multiple namespaces.
The last piece is a Go snippet showing how we get hold of the certificate via the context. As you can see from the config above, NGINX adds the authenticated cert and other details into the gRPC metadata.
// Requires: crypto/x509, encoding/pem, net/url, and from google.golang.org/grpc:
// codes, metadata, status.
meta, ok := metadata.FromIncomingContext(*ctx)
if !ok {
	return status.Error(codes.Unauthenticated, "missing metadata")
}
// Check if SSL has been handled upstream
if len(meta.Get("ssl-client-verify")) == 1 && meta.Get("ssl-client-verify")[0] == "SUCCESS" {
	if len(meta.Get("ssl-client-cert")) > 0 {
		// NGINX forwards the certificate as URL-escaped PEM
		certPEM, err := url.QueryUnescape(meta.Get("ssl-client-cert")[0])
		if err != nil {
			return status.Error(codes.Unauthenticated, "bad or corrupt certificate")
		}
		block, _ := pem.Decode([]byte(certPEM))
		if block == nil {
			return status.Error(codes.Unauthenticated, "failed to parse certificate PEM")
		}
		cert, err := x509.ParseCertificate(block.Bytes)
		if err != nil {
			return status.Error(codes.Unauthenticated, "failed to parse certificate")
		}
		return authUserFromCertificate(ctx, cert)
	}
}
// if fallen through, then try to authenticate via the peer object for gRPCS,
// or via a JWT in the metadata for gRPC Gateway.
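For the peer-object fallback mentioned in the closing comment, the core step is pulling the verified leaf certificate out of a TLS connection state. The sketch below shows just that step using the standard library. `clientCertFromState` and `stateWithLeaf` are hypothetical names; in a gRPC handler the `tls.ConnectionState` would, as far as I know, come from the `State` field of the `credentials.TLSInfo` found in the peer's `AuthInfo`, which is not shown here.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"crypto/x509/pkix"
	"errors"
	"fmt"
)

// clientCertFromState returns the leaf of the first verified chain, which is
// the client's own certificate when the server enforced client verification.
func clientCertFromState(state tls.ConnectionState) (*x509.Certificate, error) {
	if len(state.VerifiedChains) == 0 || len(state.VerifiedChains[0]) == 0 {
		return nil, errors.New("connection carries no verified client certificate")
	}
	return state.VerifiedChains[0][0], nil
}

// stateWithLeaf is a hypothetical helper that fakes a connection state
// holding a single verified certificate with the given common name.
func stateWithLeaf(cn string) tls.ConnectionState {
	leaf := &x509.Certificate{Subject: pkix.Name{CommonName: cn}}
	return tls.ConnectionState{VerifiedChains: [][]*x509.Certificate{{leaf}}}
}

func main() {
	cert, err := clientCertFromState(stateWithLeaf("alice"))
	if err != nil {
		panic(err)
	}
	fmt.Println("authenticated peer:", cert.Subject.CommonName)

	// A connection without client verification yields an error instead.
	_, err = clientCertFromState(tls.ConnectionState{})
	fmt.Println("no client cert:", err)
}
```

Using `VerifiedChains` rather than `PeerCertificates` means the certificate is only trusted if the TLS layer actually validated it against the configured CAs.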