I'm hosting ClickHouse (v20.4.3.16) with 2 replicas on Kubernetes, and it uses Zookeeper (v3.5.5) with 3 replicas, also hosted on the same Kubernetes cluster.
I need to migrate the Zookeeper used by ClickHouse to another installation, still 3 replicas but running v3.6.2.
What I tried to do was the following: using zk-shell, I copied the znodes used by ClickHouse from the old Zookeeper cluster to the new one, then pointed ClickHouse at the new cluster. ClickHouse comes back up fine, but every insert into a replicated table produces warnings like:
2021.01.13 13:03:36.454415 [ 135 ] {885576c1-832e-4ac6-82d8-45fbf33b7790} <Warning> default.check_in_availability: Tried to add obsolete part 202101_0_0_0 covered by 202101_0_1159_290 (state Committed)
and the new data is never inserted.
I've read all the documentation about Data Replication and Deduplication, and I'm sure I'm inserting new data; all the tables also include temporal fields (event_time, update_timestamp and so on), yet the inserts are simply ignored.
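For reference, the covering part named in the warning can be inspected with a query against system.parts; a sketch (database and table taken from the log line above):

```
# List active parts for the partition from the warning. The wide part
# 202101_0_1159_290 covers any newly assigned 202101_0_0_0-style part,
# so ClickHouse discards the new one as obsolete.
clickhouse-client -q "
  SELECT name, active
  FROM system.parts
  WHERE database = 'default'
    AND table = 'check_in_availability'
    AND partition_id = '202101'
  ORDER BY name"
```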
When I attach ClickHouse back to the old Zookeeper, the problem does not occur with the same data being inserted.
Is there something that needs to be done before changing the Zookeeper endpoints? Am I missing something obvious?
You cannot use this method, because it does not copy the autoincrement values that are used for part block numbers. That is why your new inserts get low block numbers again and produce parts that are already "covered by" the existing ones, exactly as the warning shows.
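To illustrate the mechanism (my sketch, not part of the method above): ClickHouse allocates block numbers via sequential znodes under block_numbers, and ZooKeeper derives each sequence number from the parent znode's cversion. A znode-by-znode copy recreates the children but not that counter, which you can compare with stat (the path is an assumption based on the default /clickhouse/tables layout; adjust it to your macros):

```
# Run against both the old and the new ensemble and compare cversion.
zkCli.sh -server localhost:2181 stat \
  /clickhouse/tables/01/check_in_availability/block_numbers/202101

# On the old cluster, cversion reflects every block number ever allocated
# (>= 1160 here); on the copied cluster it restarts near 0, so new parts
# get names like 202101_0_0_0 that are already covered by 202101_0_1159_290.
```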
There is a much simpler way: you can migrate the ZK cluster by adding the new ZK nodes as followers.
Here is a plan for ZK 3.4.9 (no dynamic reconfiguration):

1. Configure the 3 new ZK nodes as a cluster of 6 nodes (3 old + 3 new) and start them; no changes are needed on the 3 old ZK nodes at this time. (In my case a new server would not connect and download a snapshot this way, so I had to start one of them in a cluster of 4 nodes first.) See the zoo.cfg sketch after this list.
2. Make sure the 3 new ZK nodes have connected to the old ZK cluster as followers (run echo stat | nc localhost 2181 on the 3 new ZK nodes).
3. Confirm that the leader has 5 synced followers (run echo mntr | nc localhost 2181 on the leader and look for zk_synced_followers).
4. Change the zookeeper section in the configs on the CH nodes (remove the 3 old ZK servers, add the 3 new ZK servers); see the config sketch below.
5. Restart all CH nodes (CH must restart to connect to different ZK servers).
6. Make sure there are no connections from CH to the 3 old ZK nodes (run echo stat | nc localhost 2181 on the 3 old nodes and check their Clients section).
7. Remove the 3 old ZK nodes from zoo.cfg on the 3 new ZK nodes.
8. Stop data loading in CH (this is to minimize errors when CH loses ZK).
9. Restart the 3 new ZK nodes. They should form a cluster of 3 nodes.
10. When CH reconnects to ZK, start data loading.
11. Turn off the 3 old ZK nodes.
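For step 1, a minimal sketch of the temporary 6-node ensemble as seen from the new nodes (hostnames, paths and server ids are placeholders I made up, not part of the plan above):

```
# zoo.cfg on the 3 new ZK nodes during the migration: 3 old + 3 new servers.
cat >> /etc/zookeeper/conf/zoo.cfg <<'EOF'
server.1=zk-old-1:2888:3888
server.2=zk-old-2:2888:3888
server.3=zk-old-3:2888:3888
server.4=zk-new-1:2888:3888
server.5=zk-new-2:2888:3888
server.6=zk-new-3:2888:3888
EOF
# Each new node also needs its own id in the data dir, e.g. on zk-new-1:
echo 4 > /var/lib/zookeeper/myid

# Steps 2-3: the new nodes should join as followers, and the leader
# should report 5 synced followers.
echo stat | nc localhost 2181 | grep Mode                  # "Mode: follower" on new nodes
echo mntr | nc localhost 2181 | grep zk_synced_followers   # "zk_synced_followers 5" on the leader
```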
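And for steps 4-6, a sketch of the ClickHouse side, assuming the zookeeper section is managed through a config.d override (file name and hostnames are again placeholders):

```
# Step 4: on every CH node, point ClickHouse at the new ensemble only.
cat > /etc/clickhouse-server/config.d/zookeeper.xml <<'EOF'
<yandex>
    <zookeeper>
        <node><host>zk-new-1</host><port>2181</port></node>
        <node><host>zk-new-2</host><port>2181</port></node>
        <node><host>zk-new-3</host><port>2181</port></node>
    </zookeeper>
</yandex>
EOF

# Step 5: restart CH. Step 6: on each old ZK node, the Clients section
# of stat should no longer list any CH hosts.
echo stat | nc localhost 2181
```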