Search code examples
clickhouse

restoring a Replicated table in clickhouse from S3 backup


my current set up is 3 replicated nodes managed by zookeper (ch 23.3) I am trying to back up the tables i have on CH to s3 by doing the following:

table:

CREATE OR REPLACE TABLE test2 on cluster 'x'(
id UInt64,
  updated_at DateTime DEFAULT now(),
  updated_at_date Date DEFAULT toDate(updated_at)
)
ENGINE = ReplicatedReplacingMergeTree('/clickhouse/tables/{shard}/test2', '{replica}')
ORDER BY id;

backup:

BACKUP TABLE db.test2 on cluster 'x' to S3('https://backup/test2-base', '', '')

when I run this restore command I get the following error:

 restore TABLE db.test2 AS test2 on cluster 'x' from S3('https://backup/test2-base', '', '')

Error:

DB::Exception: Got error from chi%2Dclickhouse%2Dch%2Dcluster%2D0%2D1:9000. DB::Exception: Replica /clickhouse/tables/0/test2/replicas/chi-clickhouse-x-0-1 already exists: While creating table default.test2. (REPLICA_ALREADY_EXISTS)

I tried to detach the table but i face the same issue, i tried to rename the table as well but with no luck:

 restore TABLE db.test2 AS test2_recovered on cluster 'x' from S3('https://backup/test2-base', '', '')

how to delete this path? and is it stored in CH server or in Zookeeper


Solution

  • chatted with the Eng and Support teams about this. A few answers:

    The system can often take some time (up to 8 minutes?) to garbage collect old entries (could use the ZooKeeper SYNC command to force this action). Not using explicit paths by the table name can be effective, or using {uuid} in the path, or as mentioned above use unique paths in the keeper config