I have a list of names in my table as follows:
paul-jones
Ron-Thomas
John-Doe
Michael-Adams
Jim-Jones
Adam-Jones
Bob-Thomas
Bill-Thomas
I would like to delete all rows where last name after hypen is duplicate but would like to keep oldest row for each last name. For example in the above data set I would like to delete
Jim-Jones
Adam-Jones
Bob-Thomas
Bill-Thomas
and keep
paul-jones
Ron-Thomas
John-Doe
Michael-Adams
and so on.
I found the below query to extract all charecter after hyphen like this
select SUBSTRING_INDEX(full_name,'-',-1) from names;
but couldn't figure out how to update last_name column in my table with these values, so that I can delete based on unique values in my last_name column using this query maybe
("ALTER IGNORE TABLE names ADD UNIQUE (`last_name`)") ;
I hope this clearly explains my question. Thanks for the help.
You can use the SUBSTRING_INDEX
in your UPDATE query to update last_name
.
UPDATE product
SET last_name = SUBSTRING_INDEX(supplier_reference,'-',-1)
WHERE SUBSTRING_INDEX(supplier_reference,'-',-1) != ''
AND last_name = '';
WHERE
clause will ensure that the query updates last name only if there is a value after hyphen.
And then, if you want to delete duplicate rows but retain oldest one based on last_name
,
DELETE FROM product
WHERE id NOT IN (
SELECT MIN(id)
FROM product
GROUP BY last_name
)
If you have auto-incremented value id
then MIN(id)
will ensure oldest record is retained. To test, instead of DELETE
try SELECT * FROM product
and verify if these are the records you want deleted.
Please note, this query will also delete duplicate rows where last_name
is empty. If you don't want that, then add a clause WHERE last_name != ''
in the subquery.