Search code examples
databasesynchronizationreplicationsql-deleteclickhouse

Clickhouse cluster replicas sync completion time issue


I'm trying to remove about 50 milliong of 40 billions rows table (one specific day based on event_date column) , I'm using lightweight delete with "where" clause and "where" has no columns that is part of ORDER BY of PRIMARY KEY so I need to scan whole table in order to find that data with specific "event_date". I know that I can partition table or create primary key or create table with specific "ORDER BY" , but in case mentioned above without any of mentioned warkarounds is there any way to end operation only when all replicas deleted their data ? When SQL operation exceed 120 sec on mask DELETE FROM ... ON CLUSTER SETTINGS alter_sync = 2 ,replication_wait_for_inactive_replica_timeout = 600 , mutations_sync = 2 I still get error

DB::Exception: Watching task /clickhouse/cluster_name/task_queue/ddl/query-0023446431 is executing longer than distributed_ddl_task_timeout (=180) seconds. There are 10 unfinished hosts (0 of them are currently active), they are going to execute the query in background.

I want to have sql running until deletes comleted on all replicas even it take infinite time , I don't want to break SQL operation after 180 sec ( I know it is still running in background) , how can I achieve this ?


Solution

  • You could use SETTINGS distributed_ddl_task_timeout=3600

    Because you have full scan in this case, better use ALTER TABLE .. DELETE WHERE ...

    and poll SELECT hostName(), * FROM clusterAllReplicas('cluster_name',system.mutations) WHERE is_done=0 FORMAT Vertical