Let's say we have this table data (see at the end for a ready-ot-use request to create it):
+--+-----+-----+--------+
|id|name |color|shape |
+--+-----+-----+--------+
|1 |john |blue |square |
|2 |mary |green|square |
|3 |anna |red |triangle|
|4 |bob |blue |square |
|5 |susan|blue |square |
|6 |frank|red |triangle|
+--+-----+-----+--------+
With this request, it's possible to group rows by color and shape, and more importantly to add aggregation information:
SELECT
GROUP_CONCAT(name) AS names,
color,
shape,
COUNT(*) AS nb_duplicates
FROM temp_users
GROUP BY color, shape;
Result:
+--------------+-----+--------+-------------+
|names |color|shape |nb_duplicates|
+--------------+-----+--------+-------------+
|john,bob,susan|blue |square |3 |
|mary |green|square |1 |
|anna,frank |red |triangle|2 |
+--------------+-----+--------+-------------+
But how is it possible to "ungroup" the rows, in order to have:
nb_duplicates
and a unique group id (maybe auto-incremented) ?+--+-----+-----+--------+-------------+------------------+
|id|name |color|shape |nb_duplicates|duplicate_group_id|
+--+-----+-----+--------+-------------+------------------+
|1 |john |blue |square |3 |1 |
|2 |mary |green|square |1 |2 |
|3 |anna |red |triangle|2 |3 |
|4 |bob |blue |square |3 |1 |
|5 |susan|blue |square |3 |1 |
|6 |frank|red |triangle|2 |3 |
+--+-----+-----+--------+-------------+------------------+
I found a similar question Mysql: count, group by yet return all results, I tried the proposition:
SELECT
u.*,
dups.nb_duplicates
FROM temp_users u
INNER JOIN (
SELECT
u2.color,
u2.shape,
COUNT(*) AS nb_duplicates
FROM temp_users u2
GROUP BY color, shape
) AS dups ON u.color = dups.color AND u.shape = dups.shape;
But I got this error from MySql:
[HY000][1137] Can't reopen table: 'u'
Just for those who want to quicky reproduce the table:
DROP TABLE IF EXISTS temp_users;
CREATE TEMPORARY TABLE temp_users (
id INT AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(20),
color VARCHAR(20),
shape VARCHAR(20)
);
INSERT INTO temp_users(name, color, shape) VALUES
('john', 'blue', 'square'),
('mary', 'green', 'square'),
('anna', 'red', 'triangle'),
('bob', 'blue', 'square'),
('susan', 'blue', 'square'),
('frank', 'red', 'triangle');
SELECT *,
COUNT(*) OVER (PARTITION BY color, shape) nb_duplicates,
DENSE_RANK() OVER (ORDER BY color, shape) duplicate_group_id
FROM temp_users
ORDER BY id;
id | name | color | shape | nb_duplicates | duplicate_group_id |
---|---|---|---|---|---|
1 | john | blue | square | 3 | 1 |
2 | mary | green | square | 1 | 2 |
3 | anna | red | triangle | 2 | 3 |
4 | bob | blue | square | 3 | 1 |
5 | susan | blue | square | 3 | 1 |
6 | frank | red | triangle | 2 | 3 |