I'm trying to set up clickhouse-backup following this guide: https://github.com/AlexAkulov/clickhouse-backup/blob/master/Examples.md#how-to-use-clickhouse-backup-in-kubernetes
Everything looked fine, but when I wiped the data and ran a restore as a test, the restore did not work.
The remote backups look like this:
clickhouse-backup list
2022-09-05T14-28-31 73.71KiB 05/09/2022 14:29:03 remote tar, regular
2022-09-05T23-47-29 541.83MiB 05/09/2022 23:48:46 remote tar, regular
2022-09-06T20-43-43 52.16MiB 06/09/2022 20:44:15 remote tar, regular
The first backup (73.71KiB) was made with a wrong setup and contains only metadata (it had no full access to /var/lib/clickhouse).
Then I tried the sequence of commands described in the same doc: https://github.com/AlexAkulov/clickhouse-backup/blob/master/Examples.md#restore
My ClickHouse configuration is shardsCount: 1, replicasCount: 5.
So I connected to the pods and ran the following.
On all replicas:
clickhouse-backup restore_remote --rm --schema 2022-09-05T23-47-29
clickhouse-backup delete local 2022-09-05T23-47-29
On the 1st replica:
clickhouse-backup restore_remote --rm 2022-09-05T23-47-29
clickhouse-backup delete local 2022-09-05T23-47-29
Then a number of warnings like this appear:
2022/09/07 20:34:20.890442 info CREATE TABLE foo.bar (`project` String, `taskId` String, `addedAt` Nullable(DateTime('Europe/Copenhagen')), `metadata` String, `userId` String, `domain` String) ENGINE = ReplicatedMergeTree('/clickhouse/{installation}/{cluster}/tables/{shard}/foo/bar', '{replica}') PARTITION BY project ORDER BY (domain, project) SETTINGS index_granularity = 8192
2022/09/07 20:34:21.016010 warn can't create table 'foo.bar': code: 253, message: Replica /clickhouse/clickhouse/app-staging/tables/0/foo/bar/replicas/chi-clickhouse-app-staging-0-2 already exists, will try again backup=2022-09-05T23-47-29 operation=restore
clickhouse-backup tables shows a number of 0B tables.
I'm not the person who created the tables, but I assume they worked fine before I started experimenting with backups. clickhouse-backup seems to be popular, so it should work. It would be nice to know what I'm missing.
Finally I figured out what I was doing wrong. While experimenting, at some point I set CLICKHOUSE_HOST to the CHI cluster service hostname, which load-balances across all replicas. As a result, removing and re-creating databases happened on a random replica each time instead of on the pod I was connected to. In my case CLICKHOUSE_HOST should always be localhost. Working with the ZooKeeper CLI helped a lot to figure this out.
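The fix can be guarded against with a small wrapper. This is only a sketch, not part of clickhouse-backup. The function names are hypothetical. It assumes an unset CLICKHOUSE_HOST falls back to localhost, and that the replica list lives under the table's ZooKeeper path plus /replicas, as in the warning above. It refuses to run the restore sequence from the question unless CLICKHOUSE_HOST points at the local pod, and builds a system.zookeeper query as an alternative to the ZooKeeper CLI for seeing which replicas are still registered.

```shell
#!/bin/sh
# Sketch only: all function names below are hypothetical helpers.

# Succeeds only when CLICKHOUSE_HOST is unset or points at this pod.
# Assumption: an unset CLICKHOUSE_HOST means clickhouse-backup uses localhost.
is_local_clickhouse_host() {
    case "${CLICKHOUSE_HOST:-localhost}" in
        localhost|127.0.0.1) return 0 ;;
        *) return 1 ;;
    esac
}

# Prints a query listing the replicas registered in ZooKeeper for a table path,
# e.g. /clickhouse/clickhouse/app-staging/tables/0/foo/bar from the warning above.
replicas_query() {
    printf "SELECT name FROM system.zookeeper WHERE path = '%s/replicas'" "$1"
}

# Runs the restore + local cleanup from the question, but only against the local replica.
# Usage: restore_on_this_replica BACKUP_NAME [extra restore_remote flags, e.g. --schema]
restore_on_this_replica() {
    backup_name="$1"
    shift
    if ! is_local_clickhouse_host; then
        echo "CLICKHOUSE_HOST=${CLICKHOUSE_HOST} is not local; refusing to restore" >&2
        return 1
    fi
    clickhouse-backup restore_remote --rm "$@" "$backup_name" &&
        clickhouse-backup delete local "$backup_name"
}
```

With these loaded, `restore_on_this_replica 2022-09-05T23-47-29 --schema` would be run on every replica and `restore_on_this_replica 2022-09-05T23-47-29` on the first one. To inspect leftover replica entries, something like `clickhouse-client --query "$(replicas_query /clickhouse/clickhouse/app-staging/tables/0/foo/bar)"` should list them.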