Say I have a ClickHouse cluster with 3 shards and I have distributed table and local table on each node. I have two sub queries q1 and q2 from distributed table. Can I instruct ClickHouse to perform join for the final result of q1 and q2?
In my understanding if I write query like below, the join happens on each node and the left table will be from the local table instead of distributed table.
with q1 as (select * from distributed_table ...) q2 as (select * from distributed_table ...) select * from q1 GLOBAL INNER JOIN q2 on(...)
The reason why I ask this is I have a use case where I need detect the sequence of events so I need use the global order of events when I join.
select * from q1 GLOBAL INNER JOIN q2
select * from (select * from q1) as q1' INNER JOIN q2
select * from q1 INNER JOIN q2 settings distributed_product_mode='allow'
select * from q1 INNER JOIN q2 settings distributed_product_mode='local'