referring to this question:
Finding duplicate values in multiple colums in a SQL table and count
I have the following table structure:
id name1 name2 name3 ...
1 Hans Peter Frank
2 Hans Frank Peter
3 Hans Peter Frank
4 Paul Peter Hans
.
.
.
I use the following command to display the duplicates and the counts:
SELECT COUNT(name1), name1, name2, name3
FROM table
GROUP BY name1, name2, name3
HAVING (COUNT(name1) > 1) AND (COUNT(name2) > 1) AND (COUNT(name3) > 1)
This command gives me a count of 2. I would like to know how the second line could also be counted as a dublicate.
Unfortunately, the solution to the original question (Finding duplicate values in multiple colums in a SQL table and count) does not work for char
First normalize the table with UNION ALL
in a CTE to get each of the 3 names in a separate row.
Then with ROW_NUMBER()
window function you can rank alphabetically the 3 names so that you can group by them:
WITH cte(id, name) AS (
SELECT id, name1 FROM tablename
UNION ALL
SELECT id, name2 FROM tablename
UNION ALL
SELECT id, name3 FROM tablename
)
SELECT COUNT(*) count, name1, name2, name3
FROM (
SELECT id,
MAX(CASE WHEN rn = 1 THEN name END) name1,
MAX(CASE WHEN rn = 2 THEN name END) name2,
MAX(CASE WHEN rn = 3 THEN name END) name3
FROM (
SELECT *, ROW_NUMBER() OVER (PARTITION BY id ORDER BY name) rn
FROM cte
)
GROUP BY id
)
GROUP BY name1, name2, name3
HAVING COUNT(*) > 1;
Another way to do it, that uses similar logic to your previous question with numeric values, with string function REPLACE()
instead of window functions, but works only if the 3 names in each row are different:
SELECT COUNT(*) count,
MIN(name1, name2, name3) name_1,
REPLACE(
REPLACE(
REPLACE(name1 || ',' || name2 || ',' || name3, MIN(name1, name2, name3), ''),
MAX(name1, name2, name3), ''), ',', ''
) name_2,
MAX(name1, name2, name3) name_3
FROM tablename
GROUP BY name_1, name_2, name_3
HAVING COUNT(*) > 1;
See the demo.