I have a table named transactions (alias txs for short) that contains 15 million rows, PARTITIONED BY txs.year, with data accumulated over the last 10 years (approx. 1 to 1.5 million rows per year). The source of this data is a MySQL database, and the only data that changes is the data for the current year. My strategy is to set up a daily CRON job that exports all records for the current year in compressed CSV format (e.g. 20XX-txs.csv.gz) and then use AWS Glue/PySpark to convert it to snappy.parquet format, PARTITIONED BY txs.year.
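For reference, here is a minimal sketch of the ClickHouse target table I have in mind; the columns are placeholders, only the yearly partitioning mirrors the setup above:

CREATE TABLE txs
(
    tx_id   UInt64,           -- placeholder columns; real schema differs
    tx_date Date,
    amount  Decimal(18, 2),
    year    UInt16
)
ENGINE = MergeTree
PARTITION BY year
ORDER BY tx_id;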
I've read that you can easily DROP PARTITIONS with ClickHouse (reference). One thing throws me off, and there is no further explanation. Their documentation states:
Deletes the specified partition from the table. This query tags the partition as inactive and deletes data completely, approximately in 10 minutes.
What I am wondering is: can I INSERT updated data from the newly created snappy.parquet file immediately after DROPPING the stale partition for the current_year, or do I have to wait the full 10 minutes before doing so?

Example use case:
# STEP 1: Get updated data for current_year
# -----------------------------------------
$ wget https://s3.amazonaws.com/xxx.xxx/2021-txs.snappy.parquet
# STEP 2: Drop existing PARTITION for current_year
# -----------------------------------------
$ clickhouse-client --query="ALTER TABLE txs DROP PARTITION '2021'"
# STEP 3: INSERT updated data for current_year into the table
# -----------------------------------------
$ cat 2021-txs.snappy.parquet | clickhouse-client --query="INSERT INTO txs FORMAT Parquet"
It's irrelevant in your use case.

It's about removing data from disk, not about the table. (Users sometimes worry about when disk space is freed.)
This query tags the partition as inactive and deletes data completely, approximately in 10 minutes.
That is a funny side effect of merges.

Active parts are removed immediately by DROP PARTITION, but inactive parts are not.
create table t(a Int64) Engine=MergeTree order by a;

insert into t values (1);

select name, active from system.parts where table = 't';
┌─name──────┬─active─┐
│ all_1_1_0 │      1 │
└───────────┴────────┘

optimize table t final;
-- forced merge produced the new part all_1_1_1 (active) and left the old
-- part all_1_1_0 (inactive).

select name, active from system.parts where table = 't';
┌─name──────┬─active─┐
│ all_1_1_0 │      0 │
│ all_1_1_1 │      1 │
└───────────┴────────┘

alter table t drop partition tuple();
-- DATA IS DROPPED -- but only active parts

select count() from t;
┌─count()─┐
│       0 │
└─────────┘

-- but inactive parts are still waiting for the
-- background process to drop them after 8 minutes of inactivity
select name, active from system.parts where table = 't';
┌─name──────┬─active─┐
│ all_1_1_0 │      0 │
└───────────┴────────┘

ls -1 /var/lib/clickhouse/data/default/t/
all_1_1_0
detached
format_version.txt
The "10 minutes" is actually 8 minutes (480 s), controlled by the MergeTree setting old_parts_lifetime.
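If needed, you can check the effective value and lower it per table (a sketch; the 60-second value is just an illustration):

select value from system.merge_tree_settings where name = 'old_parts_lifetime';  -- 480 by default

-- lower it for a single table if you really want faster cleanup
alter table txs modify setting old_parts_lifetime = 60;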
Google Translate: Inactive parts are not deleted immediately because fsync is not called when writing a new part, i.e. for some time the newly merged part exists only in the server's RAM (OS cache). So if the server reboots spontaneously, the just-merged part can be lost or damaged. During startup, ClickHouse checks the integrity of the parts; if it detects a problem, it returns the inactive parts to the active list and later merges them again. The broken part is then renamed (the prefix broken is added) and moved to the detached folder. If the integrity check finds no problems in the merged part, the original inactive parts are renamed (the prefix ignored is added) and moved to the detached folder.
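If you want to watch the cleanup happen, the leftover inactive parts stay visible in system.parts until the background process removes them (a sketch against the txs table from the question):

-- inactive parts that are still on disk, waiting to be cleaned up
select name, active, remove_time
from system.parts
where table = 'txs' and active = 0;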