Yes, you can find similar questions numerous times, but: the most elegant solutions posted here, work for SQL Server, but not for Sybase (in my case Sybase Anywhere 11). I have even found some Sybase-related questions marked as duplicates for SQL Server questions, which doesn't help.
One example for solutions I liked, but didn't work, is the WITH ... DELETE ...
construct.
I have found working solutions using cursors or while-loops, but I hope it is possible without loops.
I hope for a nice, simple and fast query, just deleting all but one exact duplicate.
Here a little framework for testing:
IF OBJECT_ID( 'tempdb..#TestTable' ) IS NOT NULL
DROP TABLE #TestTable;
CREATE TABLE #TestTable (Column1 varchar(1), Column2 int);
INSERT INTO #TestTable VALUES ('A', 1);
INSERT INTO #TestTable VALUES ('A', 1); -- duplicate
INSERT INTO #TestTable VALUES ('A', 1); -- duplicate
INSERT INTO #TestTable VALUES ('A', 2);
INSERT INTO #TestTable VALUES ('B', 1);
INSERT INTO #TestTable VALUES ('B', 2);
INSERT INTO #TestTable VALUES ('B', 2); -- duplicate
INSERT INTO #TestTable VALUES ('C', 1);
INSERT INTO #TestTable VALUES ('C', 2);
SELECT * FROM #TestTable ORDER BY Column1,Column2;
DELETE <your solution here>
SELECT * FROM #TestTable ORDER BY Column1,Column2;
Ok, now that I know the ROWID()
function, solutions for tables with primary key (PK) can be easily adopted. This one first selects all rows to keep and then deletes the remaining ones:
DELETE FROM #TestTable
FROM #TestTable
LEFT OUTER JOIN (
SELECT MIN(ROWID(#TestTable)) rowid
FROM #TestTable
GROUP BY Column1, Column2
) AS KeepRows ON ROWID(#TestTable) = KeepRows.rowid
WHERE KeepRows.rowid IS NULL;
...or how about this shorter variant? I like!
DELETE FROM #TestTable
WHERE ROWID(#TestTable) NOT IN (
SELECT MIN(ROWID(#TestTable))
FROM #TestTable
GROUP BY Column1, Column2
);
In this post, which inspired me most, is a comment that NOT IN
might be slower. But that's for SQL server, and sometimes elegance is more important :) - I also think it all depends on good indexes.
Anyway, usually it is bad design, to have tables without a PK. You should at least add an "autoinc" ID, and if you do, you can use that ID instead of the ROWID()
function, which is a non-standard extension by Sybase (some others have it, too).