Search code examples
sqlsql-servergroup-byduplicateslarge-data

Selecting ONLY Duplicates from a joined tables query


I have the following query that I'm trying to join two tables matching their ID so I can get the duplicated values in "c.code". I've tried a lot of queries but nothing works. I have a 500k rows in my database and with this query I only get 5k back, which is not right. Im positive it's at least 200K. I also tried to use Excel but it's too much for it to handle. Any ideas? Thanks in advance, everyone.

SELECT c.code, c.name as SCT_Name, t.name as SYNONYM_Name, count(c.code)
FROM database.Terms as t
  join database.dbo.Concepts as c on c.ConceptId = t.ConceptId
  where t.TermTypeCode = 'SYNONYM' and t.ConceptTypeCode = 'NAME_Code' and c.retired = '0'
   Group by c.code, c.name, t.name
   HAVING COUNT(c.code) > = 1

Order by c.code

Solution

  • with data as (
        select c.code, c.name as SCT_Name, t.name as SYNONYM_Name
        from database.Terms as t inner join database.dbo.Concepts as c
            on c.ConceptId = t.ConceptId
        where
                t.TermTypeCode = 'SYNONYM'
            and t.ConceptTypeCode = 'NAME_Code'
            and c.retired = '0'
    )
    select *
        --, (select count(*) from data as d2 where d2.code = data.code) as code_count
        --, count(*) over (partition by code) as code_count
    from data
    where code in (select code from data group by code having count(*) > 1)
    order by code