I thought I'd made the column userid
in my table "userslive"
unique, but somehow must have made a mistake. I've seen multiple answers to this question, but I'm afraid of messing up again so I hope someone can help me directly.
So this table has no unique columns, but I've got a column "timer" which was the timestamp of scraping the data. If possible I'd like to drop rows with the lowest "timer"
with duplicate "userid"
column.
It's a fairly big table at about 2 million
rows (20 columns
). There is about 1000
duplicate userid
which I've found using this query:
SELECT userid, COUNT(userid) as cnt FROM userslive GROUP BY userid HAVING (cnt > 1);
Is this the correct syntax? I tried this on a backup table, but I suspect this is too heavy for a table this big (unless left to run for a very long time.
DELETE FROM userslive using userslive,
userslive e1
where userslive.timer < e1.timer
and userslive.userid = e1.userid
Is there a quicker way to do this?
EDIT: I should say the "timer" is not a unique column.
DELETE t1.* /* delete from a copy named t1 only */
FROM userslive t1, userslive t2
WHERE t1.userid = t2.userid
AND t1.timer < t2.timer
Logic: if for some record (in a copy aliased as t1) we can find a record (in a table copy aliased as t2) with the same user but with greater/later timer value - this record must be deleted.