
Is it safe to use etcd across multiple data centers?


Is it safe to use etcd across multiple data centers? Doing so exposes the etcd ports to the public internet. Do I have to use client certificates in this case, or does etcd have some form of built-in authentication?


Solution

  • Yes, but there are two big issues you need to tackle:

    1. Security. How much protection you need depends on what kind of data you are storing in etcd. A point-to-point VPN between the data centers is probably preferable to exposing the entire cluster to the internet. Client certificates can also be used.

    2. Tuning. etcd relies on replication between machines for two things: liveness and consensus. Since a write must be committed to a majority of the cluster before it returns as successful, your write performance will degrade as the distance between the machines increases. Liveness is measured with periodic heartbeats between the machines. By default, etcd's timing is fairly aggressive and optimized for bare metal servers on a local network: a 50ms heartbeat at the time of writing (current releases default to a 100ms heartbeat interval and a 1000ms election timeout). Without tuning these values, your cluster will constantly think that members have disappeared and trigger frequent leader elections. This gets worse if both of your environments are on cloud providers that have variable network latency plus disk writes that traverse the network, a double whammy.

    More info on etcd tuning: https://etcd.io/docs/latest/tuning/
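
    For the security point above, if you go with client certificates rather than (or in addition to) a VPN, the client side needs a TLS configuration that presents its own certificate and trusts the CA that signed the server's. Below is a minimal sketch using only the Go standard library. The helper name `mutualTLSConfig` is hypothetical, and the PEM inputs would normally be read from files. It assumes the etcd servers are started with client certificate checking enabled (the `--client-cert-auth` and `--trusted-ca-file` flags); the resulting `*tls.Config` is what you would hand to your etcd client.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"errors"
	"fmt"
)

// mutualTLSConfig is a hypothetical helper that builds a client-side TLS
// config from PEM-encoded data: the client's certificate and key, plus the
// CA certificate used to verify the server.
func mutualTLSConfig(certPEM, keyPEM, caPEM []byte) (*tls.Config, error) {
	// Parse the client certificate and its private key.
	cert, err := tls.X509KeyPair(certPEM, keyPEM)
	if err != nil {
		return nil, fmt.Errorf("loading client key pair: %w", err)
	}
	// Trust only the given CA instead of the system roots.
	pool := x509.NewCertPool()
	if !pool.AppendCertsFromPEM(caPEM) {
		return nil, errors.New("no CA certificates found in PEM input")
	}
	return &tls.Config{
		Certificates: []tls.Certificate{cert},
		RootCAs:      pool,
		MinVersion:   tls.VersionTLS12,
	}, nil
}

func main() {
	// Placeholder input to show the error path; real PEM data would
	// typically come from os.ReadFile on your certificate files.
	_, err := mutualTLSConfig([]byte("not a cert"), []byte("not a key"), []byte("not a ca"))
	fmt.Println("error for placeholder input:", err != nil)
}
```

    Traffic between cluster members across data centers is a separate channel from client traffic and needs its own protection, either through the VPN or through etcd's peer TLS settings.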