I have a table with thousands of records:
+-----+-------+------+----------+
| id | group | code | label |
+-----+-------+------+----------+
| 1 | abc | 345 | myLabel1 |
| 2 | abc | 345 | myLabel4 |
| 3 | def | 987 | myLabel2 |
| 4 | def | 345 | myLabel4 |
| 5 | def | 987 | myLabel2 |
| 6 | def | 231 | myLabel1 |
| 7 | def | 231 | myLabel9 |
| 8 | abc | 345 | myLabel3 |
| 9 | def | 987 | myLabel2 |
| 10 | def | 345 | myLabel1 |
| ... | ... | ... | ... |
+-----+-------+------+----------+
In this table some entries are similar to each other.
For example 1,2 and 8: group
is "abc" and code
is "345".
Another example are entries 3, 5 and 9: group
is "def", code
is "987" and label
is "myLabel2".
Formally one could say:
IF group = "abc", THEN code = "345"
IF group = "def" AND code = "987", THEN label = "myLabel2 "
I want to run a program over my tables which finds those similarities.
PS: This table is just an example. It actually has more than 20 columns with different names.
This will give you duplicate rows in table with number of time occurrence
SELECT [group], [code] , COUNT(1) as CNT
FROM YourTableName
GROUP BY [group], [code]
HAVING COUNT(1) > 1
ORDER BY CNT;