Search code examples
sqlsql-serverduplicate-dataauto-generatedatabase-table

Generate connection table from table with duplicates


I have a SQL Server database table (DuplicateIds) that contains the ID's of duplicate words from another table (Words). Here's and example of the data in the table DuplicateIds:

        word_id  |  word
----------------------------------
        244      |  ape
        603      |  ape
       1873      |  ape
        372      |  banana
       3095      |  banana

...and so on. Often there are only two or three duplicates but there are cases with 10 or even more duplicates.

Now I want to use this table with duplicates to create a new table that connects the ID's of the words that are the same. I guess the new table would look something like this:

        word_id  |  connected_id
----------------------------------
        244      |    603
        244      |   1873
        603      |    244
        603      |   1873
       1873      |    244
       1873      |    603
        372      |   3095
       3095      |    372

With this table I can look up a certain word using its ID and get the ID's of all the words that are the same.

Now I wonder if I can write a (T)SQL statement that generates this new connection table for me, using the data from DuplicateIds?


Solution

  • This should do it:

    SELECT
       di.word_id
      ,di2.word_id  connected_id
     into NewTable
     from DuplicateIds di
      inner join DuplicateIds di2
       on di2.word = di.word
        and di2.word_id <> di.word_id