Search code examples
sqlclickhouse

ClickHouse Distributed Table Slow


I have created a distributed table in ClickHouse with two shards and one replica of the data in total.

SELECT *
FROM system.clusters

┌─cluster──────────────┬─shard_num─┬─shard_weight─┬─replica_num─┬─host_name────┬─host_address─┬─port─┬─is_local─┬─user────┬─default_database─┐
│ twoshardsonereplicas │         1 │            1 │           1 │ 192.168.8.70 │ 192.168.8.70 │ 9000 │        0 │ default │                  │
│ twoshardsonereplicas │         2 │            1 │           1 │ 192.168.8.71 │ 192.168.8.71 │ 9000 │        1 │ default │                  │
└──────────────────────┴───────────┴──────────────┴─────────────┴──────────────┴──────────────┴──────┴──────────┴─────────┴──────────────────┘

I have the tables exit and exit_all which is the distributed table. I insert into the table exit_all and it distributes the data evenly across my two servers. When I query from the local exit table on half the records it takes 0.15, seconds but when doing on exit_all it takes 0.45 seconds. I would expect it to take around the same amount of time since the processing is split.

I can see when I count my query that the data is split perfectly for this query or within 100 records on 100k so it's not like all the data is sitting on one server.

Edit: Also getting large variances in query time every 2-3 queries where for some, it will go from 1.3 seconds down to 750ms for the same query.


Solution

  • To answer the first question:

    Distributed tables can add some amount of overhead to processing. Since you are doing two queries, passing the result via https and merging them, the last two parts can well add 0.3 seconds of processing, simply passing the data from one machine to another (in a local network) via http can take well over 0.1 seconds.

    Tl;dr You're distributed query is still taking ~0.15 seconds, transferring and merging the result takes more time however

    To answer the second question: Clickhouse has internal query caching, so repeating a query might or might not trigger data withdrawal from the cache making it faster. Hence why you are noticing inconsistencies with query time.