I have a table like this:
id | cluster_id | user_id | name | ...
1 | 1 | 1 | test name
2 | 1 | 3 | other
3 | null | 1 | one more
4 | 2 | 1 | foo
5 | null | 1 | bar
6 | 1 | 1 | baz
I'd like to create a query that groups by the cluster_id
column but only group by the columns that have a not null value so that I get something like this:
id | cluster_id | user_id | ...
1 | 1 | 1 | test name
3 | null | 1 | one more
4 | 2 | 1 | foo
5 | null | 1 | bar
I'd like to a list with distinct cluster_ids, but only where the cluster_id is not null. Also I'd like to filter on arbitrary columns like user_id
.
In the above result I also query on the user_id
, where the user_id
is 1.
How do I create such a query?
Thanks in advance!
The query is simple.
The GROUP BY also works with NULL Values
I made two queries, the first includes the user_id the last not
You must test this with id as primary key, and see if excluding the NULL brings some performance
CREATE TABLE tab1 ( `id` INTEGER, `cluster_id` int, `user_id` INTEGER, `name` VARCHAR(20) );
INSERT INTO tab1 (`id`, `cluster_id`, `user_id`, `name`) VALUES ('1', '1', '1', 'test name'), ('2', '1', '3', 'other'), ('3', null, '1', 'one more'), ('4', '2', '1', 'foo'), ('5', null, '1', 'bar'), ('6', '1', '1', 'baz');
SELECT * FROM tab1 WHERE `id` IN (SELECT MIN(`id`) FROM tab1 GROUP BY `cluster_id`,`user_id`) UNION SELECT * FROM tab1 WHERE `cluster_id` IS NULL
id | cluster_id | user_id | name -: | ---------: | ------: | :-------- 1 | 1 | 1 | test name 2 | 1 | 3 | other 3 | null | 1 | one more 4 | 2 | 1 | foo 5 | null | 1 | bar
SELECT * FROM tab1 WHERE `id` IN (SELECT MIN(`id`) FROM tab1 WHERE `cluster_id` IS NOT NULL GROUP BY `cluster_id`,`user_id`) UNION SELECT * FROM tab1 WHERE `cluster_id` IS NULL
id | cluster_id | user_id | name -: | ---------: | ------: | :-------- 1 | 1 | 1 | test name 2 | 1 | 3 | other 4 | 2 | 1 | foo 3 | null | 1 | one more 5 | null | 1 | bar
SELECT * FROM tab1 WHERE `id` IN (SELECT MIN(`id`) FROM tab1 GROUP BY `cluster_id`) UNION SELECT * FROM tab1 WHERE `cluster_id` IS NULL
id | cluster_id | user_id | name -: | ---------: | ------: | :-------- 1 | 1 | 1 | test name 3 | null | 1 | one more 4 | 2 | 1 | foo 5 | null | 1 | bar
SELECT * FROM tab1 WHERE `id` IN (SELECT MIN(`id`) FROM tab1 WHERE `cluster_id` IS NOT NULL GROUP BY `cluster_id`) UNION SELECT * FROM tab1 WHERE `cluster_id` IS NULL
id | cluster_id | user_id | name -: | ---------: | ------: | :-------- 1 | 1 | 1 | test name 4 | 2 | 1 | foo 3 | null | 1 | one more 5 | null | 1 | bar
db<>fiddle here