I have thousands of rows in a table. Some rows have similar keywords but can be categorized in the same group. For example:
Table : Birds_Name
+-------+---------------------+
|ID |Name |
+-------+---------------------+
|1 |Blue Peckwood |
+-------+---------------------+
|2 |North Peckwood |
+-------+---------------------+
|3 |Northern Peckwood |
+-------+---------------------+
|4 |Northern Peckwood |
+-------+---------------------+
|5 |Red Heron |
+-------+---------------------+
|6 |Red Heron |
+-------+---------------------+
As for the table above there should be 2 groups of birds. They are Peckwood and Heron.
But after I run this mySQL I get :
SELECT *
FROM birds_name
WHERE name IN (
SELECT name
FROM birds_name
GROUP BY name
HAVING COUNT(*) > 1
)
After I run the query. This is what I've got:
+-------+---------------------+
|3 |Northern Peckwood |
+-------+---------------------+
|4 |Northern Peckwood |
+-------+---------------------+
|5 |Red Heron |
+-------+---------------------+
|6 |Red Heron |
+-------+---------------------+
Actually, I expect any row which share a similar string to be chosen (in this case it's Peckwood. So it should have only 2 groups - Peckwood and Heron.
Is it possible to do so? And how to adapt mysql code to achieve it?
Regards.
Try this
SELECT SUBSTRING_INDEX(name,' ',-1),count(*)
FROM birds_name
GROUP BY SUBSTRING_INDEX(name,' ',-1) HAVING count(*)>0;
Manual for SUBSTRING_INDEX function in mysql.