Search code examples
etcd

etcd2 in proxy mode doesn't do anything useful


I have an etcd cluster using TLS for security. I want other machines to use etcd proxy, so the localhost clients don't need to use TLS. Proxy is configured like this:

[Service]
Environment="ETCD_PROXY=on"
Environment="ETCD_INITIAL_CLUSTER=etcd1=https://master1.example.com:2380,etcd2=https://master2.example.com:2380"
Environment="ETCD_PEER_TRUSTED_CA_FILE=/etc/kubernetes/ssl/ca.pem"
Environment="ETCD_PEER_CERT_FILE=/etc/kubernetes/ssl/worker.pem"
Environment="ETCD_PEER_KEY_FILE=/etc/kubernetes/ssl/worker-key.pem"
Environment="ETCD_TRUSTED_CA_FILE=/etc/kubernetes/ssl/ca.pem"

And it works, as far as the first connection goes. But the etcd client does an initial query to discover the full list of servers, and then it performs its real query against one of the servers in that list:

$ etcdctl --debug ls
start to sync cluster using endpoints(http://127.0.0.1:4001,http://127.0.0.1:2379)
cURL Command: curl -X GET http://127.0.0.1:4001/v2/members
got endpoints(https://1.1.1.1:2379,https://1.1.1.2:2379) after sync
Cluster-Endpoints: https://1.1.1.1:2379, https://1.1.1.2:2379
cURL Command: curl -X GET https://1.1.1.1:2379/v2/keys/?quorum=false&recursive=false&sorted=false
cURL Command: curl -X GET https://1.1.1.2:2379/v2/keys/?quorum=false&recursive=false&sorted=false
Error:  client: etcd cluster is unavailable or misconfigured
error #0: x509: certificate signed by unknown authority
error #1: x509: certificate signed by unknown authority

If I change the etcd masters to --advertise-client-urls=http://localhost:2379, then the proxy will connect to itself and get into an infinite loop. And the proxy doesn't modify the traffic between the client and the master, so it doesn't rewrite the advertised client URLs.

I must not be understanding something, because the etcd proxy seems useless.


Solution

  • Turns out that most etcd clients (locksmith, flanneld, etc.) will work just fine with a proxy in this mode. It's only etcdctl that behaves differently. Because I was testing with etcdctl, I thought the proxy config wasn't working at all.

    1. If etcdctl is run with --skip-sync, then it will communicate through the proxy rather than retrieving the list of public endpoints.
    2. etcdctl cluster-health ignores --skip-sync and always touches the public etcd endpoints. It will never work with a proxy.