I've found it so weird that the CPU load of the server our ClickHouse instance runs on has risen every 4 hours since Dec. 1st, even though there are no scheduled jobs.
CPU Load during Last 2 Days
Then I found that the values of ProfileEvent_Merge and ProfileEvent_MergedRows in the system table metric_log might explain the CPU load rise. The number of merges and the number of merged rows follow a similar trend, as shown below.
Merged Rows Per Second during Last 90 Days
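For reference, this is roughly how I charted those counters (a sketch; the ProfileEvent_* column names are as they appear in my 21.11 metric_log):

```sql
-- Hourly merge activity from system.metric_log
SELECT
    toStartOfHour(event_time)    AS hour,
    sum(ProfileEvent_Merge)      AS merges,
    sum(ProfileEvent_MergedRows) AS merged_rows
FROM system.metric_log
WHERE event_date >= today() - 90
GROUP BY hour
ORDER BY hour;
```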
I followed the changes of system.merges at 4 p.m. today (Dec. 8th). Finally, I found that the merge of two system tables, query_log and query_thread_log, takes about 200s, and the CPU load clearly rose while it ran.
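I watched the merges with queries like the following (a sketch; the second query additionally requires system.part_log to be enabled in the server config):

```sql
-- Currently running merges and their progress
SELECT database, table, elapsed, progress, num_parts, rows_read
FROM system.merges;

-- Recently completed merges, if part_log is enabled
SELECT event_time, database, table, duration_ms, rows
FROM system.part_log
WHERE event_type = 'MergeParts'
ORDER BY event_time DESC
LIMIT 20;
```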
I'm using ClickHouse 21.11.4.14. I have 8 user tables: 1 using the MergeTree engine, 3 using ReplacingMergeTree, and 4 using VersionedCollapsingMergeTree. The server has 2 cores and runs CentOS 7.9 64-bit.
If there is anything relevant that I didn't mention, or more information you'd like to know, please reply.
There are three things about the merges I can't understand:
1. Why does the number of merges suddenly increase on the first day of a month (Oct. 1st, Nov. 1st, Dec. 1st), given that there were no changes to the configuration or the table structures?
2. Why does the number of merges increase every 4 hours? As far as I remember, there is no fixed schedule for merging.
3. Why does the number of merges decrease gradually over the course of Oct., Nov., and Dec.?
Could you please explain them for me? And what can I do to solve the CPU-load problem? Thanks very much for your reply.
This is the DDL of system.query_log. I've modified the TTL of some log tables in the system database to TTL event_time + toIntervalMonth(3) (for query_log, query_thread_log, metric_log, etc.) in order to save disk space.
create table query_log
(
type Enum8('QueryStart' = 1, 'QueryFinish' = 2, 'ExceptionBeforeStart' = 3, 'ExceptionWhileProcessing' = 4),
event_date Date,
event_time DateTime,
event_time_microseconds DateTime64(6),
query_start_time DateTime,
query_start_time_microseconds DateTime64(6),
query_duration_ms UInt64,
read_rows UInt64,
read_bytes UInt64,
written_rows UInt64,
written_bytes UInt64,
result_rows UInt64,
result_bytes UInt64,
memory_usage UInt64,
current_database String,
query String,
formatted_query String,
normalized_query_hash UInt64,
query_kind LowCardinality(String),
databases Array(LowCardinality(String)),
tables Array(LowCardinality(String)),
columns Array(LowCardinality(String)),
projections Array(LowCardinality(String)),
views Array(LowCardinality(String)),
exception_code Int32,
exception String,
stack_trace String,
is_initial_query UInt8,
user String,
query_id String,
address IPv6,
port UInt16,
initial_user String,
initial_query_id String,
initial_address IPv6,
initial_port UInt16,
initial_query_start_time DateTime,
initial_query_start_time_microseconds DateTime64(6),
interface UInt8,
os_user String,
client_hostname String,
client_name String,
client_revision UInt32,
client_version_major UInt32,
client_version_minor UInt32,
client_version_patch UInt32,
http_method UInt8,
http_user_agent String,
http_referer String,
forwarded_for String,
quota_key String,
revision UInt32,
log_comment String,
thread_ids Array(UInt64),
ProfileEvents Map(String, UInt64),
Settings Map(String, String),
used_aggregate_functions Array(String),
used_aggregate_function_combinators Array(String),
used_database_engines Array(String),
used_data_type_families Array(String),
used_dictionaries Array(String),
used_formats Array(String),
used_functions Array(String),
used_storages Array(String),
used_table_functions Array(String),
`ProfileEvents.Names` Array(String) alias mapKeys(ProfileEvents),
`ProfileEvents.Values` Array(UInt64) alias mapValues(ProfileEvents),
`Settings.Names` Array(String) alias mapKeys(Settings),
`Settings.Values` Array(String) alias mapValues(Settings)
)
engine = MergeTree PARTITION BY toYYYYMM(event_date)
ORDER BY (event_date, event_time)
TTL event_time + toIntervalMonth(3)
SETTINGS index_granularity = 8192;
...
PARTITION BY toYYYYMM(event_date)
...
TTL event_time + toIntervalMonth(3)
The table is partitioned by month, but the TTL keeps data for 3 months. So every 4 hours (the default MergeTree setting merge_with_ttl_timeout is 14400 seconds, i.e. 4 hours) ClickHouse runs a TTL merge: it rewrites the oldest parts, dropping the expired rows that are older than 3 months.
But instead of a TTL re-merge, TTL is able to drop whole partitions without reading/rewriting them. This drop operation is much lighter on CPU and disk I/O.
You can enable it using the MergeTree setting ttl_only_drop_parts. In this case data will be stored for up to 4 months, because all rows in a monthly partition only become expired 4 months after the partition's first day.
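For example (a sketch; repeat for the other log tables whose TTL you changed, such as query_thread_log and metric_log):

```sql
-- Drop whole expired parts instead of rewriting them
ALTER TABLE system.query_log
    MODIFY SETTING ttl_only_drop_parts = 1;
```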
Another way is to switch to daily partitioning and set the TTL rule to match the partitioning.
PARTITION BY toYYYYMMDD(event_date)
...
TTL toStartOfDay(event_time) + toIntervalMonth(3)
toStartOfDay -- all rows in a part will be expired at the same moment in time.
toYYYYMMDD -- daily partitions
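The TTL expression can be changed in place with ALTER; the partitioning key cannot, so the table has to be recreated with the new key. A sketch of the TTL part:

```sql
-- Align the TTL with day-granular expiry on the existing table
ALTER TABLE system.query_log
    MODIFY TTL toStartOfDay(event_time) + toIntervalMonth(3);
```

For the built-in log tables, the partition key can instead be set in the server config (e.g. `partition_by` under `<query_log>` in config.xml); as far as I know, after a restart ClickHouse renames the old table and creates a new one with the configured key.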
https://kb.altinity.com/altinity-kb-setup-and-maintenance/altinity-kb-system-tables-eat-my-disk/
https://kb.altinity.com/altinity-kb-queries-and-syntax/ttl/
https://clickhouse.com/docs/en/operations/settings/settings/#ttl_only_drop_parts
https://clickhouse.com/docs/en/operations/settings/merge-tree-settings/