I'm wondering how ClickHouse sorts data internally if I use a number as the primary key. The documentation says:
"The inserted rows are stored on disk in lexicographical order (ascending) by the primary key columns"
which would (maybe) imply that numbers are treated as strings, so that for example 10000 comes before 2. But in the examples in the same documentation, where they sort by UserID (a field of type UInt32), it looks like the data is actually sorted by numeric value and not by lexicographical representation: the row with UserID 240923 comes before the row with UserID 10487847.
My guess and expectation is that numbers (and other types like dates) are sorted the way we would expect those types to be sorted, so that an ORDER BY clause later on can use the primary / sorting key, but I just wanted to double-check.
Thanks!
Your guess is correct: numbers are sorted by numeric value and strings are sorted lexicographically, always in ascending order.
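
To make the difference concrete, here is a small sketch in plain Python (not ClickHouse itself) that contrasts numeric ordering with string ordering, using the UserID values from the question. It also shows one plausible reading of "lexicographical" in the quoted sentence, namely that rows are compared column by column across the key columns, with each column compared according to its own type. That reading is an assumption on my part, and the two-column rows below are made up for illustration.

```python
# UserID values taken from the question.
user_ids = [10487847, 2, 240923, 10000]

# Numeric comparison: what you get for a UInt32 key column.
numeric_order = sorted(user_ids)

# String comparison: what you would get only if the values were stored as strings.
string_order = sorted(str(u) for u in user_ids)

# Hypothetical two-column key (UserID, URL): tuples compare by the first
# element, then by the second on ties, each element by its own type.
rows = [(240923, "b"), (2, "z"), (240923, "a"), (10000, "m")]
tuple_order = sorted(rows)

print(numeric_order)  # [2, 10000, 240923, 10487847]
print(string_order)   # ['10000', '10487847', '2', '240923']
print(tuple_order)    # [(2, 'z'), (10000, 'm'), (240923, 'a'), (240923, 'b')]
```

With a numeric column, 240923 sorts before 10487847, which matches what the documentation example shows. The "10000 before 2" ordering only appears when the values are compared as strings.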