sql sql-server-2008 select subquery linked-server

Query not listing duplicate records in sql?

I am using below query to find duplicate records

SELECT F.ID,f.CIVILID
FROM DBO.EMPUDF AS F
WHERE f.civilid IN (SELECT CIVILID
                    FROM dBO.EMPUDF
                    GROUP BY CIVILID
                    HAVING COUNT (CIVILID) > 1)

The subquery returns 218 records which are duplicate records, but when I try to execute the query to return name and id it returns more than 10k records with many null id columns.

Solution

Well, you are dealing with duplicates, so there are a lot of them.

You can get an idea of how many are duplicated by doing a histogram of histograms:

SELECT cnt, COUNT(*), MIN(CIVILID), MAX(CIVILID)
FROM (SELECT CIVILID, COUNT(*) as cnt
      FROM  dBO.EMPUDF
      GROUP BY CIVILID
     ) c
GROUP BY cnt
ORDER BY cnt;

Also, your original query is more simply written using window functions:

SELECT e.*
FROM (SELECT e.*, COUNT(*) OVER (PARTITION BY CIVILID) as cnt
      FROM  dBO.EMPUDF
     ) e
WHERE cnt > 1;