I have a 3 node ClickHouse cluster
<secret>clickhousecluster</secret>
<shard>
<replica>
<host>CHN1</host>
<port>9000</port>
</replica>
</shard>
<shard>
<replica>
<host>CHN2</host>
<port>9000</port>
</replica>
</shard>
<shard>
<replica>
<host>CHN3</host>
<port>9000</port>
</replica>
</shard>
I created a distributed table:
columns:
`timestamp` DateTime CODEC(Delta(4), ZSTD(1)),
`host` LowCardinality(String),`message` String ,
`severity` LowCardinality(String),`message` String
create table db1.hosts_distr on
cluster cluster1 as db1.hosts engine = Distributed(cluster1, db1, hosts, if(severity = 'INFO', 1, 3));
This if(severity = 'INFO', 1, 3)) doesn't work either. It just puts all records on 1 node. What I need is to send all records with severity INFO to 2 nodes and all the rest to 1 node. Is it possible?
Thanks in advance.
when you insert data into distirbuted table
than data will slit in temporary .bin files with Native format, by sharding_expression % number_of_shards in cluster
so proper sharding expression should be in your case
Distributed(cluster1, db1, hosts, if(severity = 'INFO', 1, rand() % 2));
and you will need to change % 2
to % N
when you add more shards
if your db1.hosts
have engine=ReplicatedMergeTree
then you also need add <internal_replication>true</internal_replication>
into each <shard>