Search code examples
sqliteduplicatesrelational

sql database delete duplicate relationships


I have a simple database table with over 5000 records. Each record says for example 'john' is related to 'jack'. Almost every relationship is likely to be in the database twice because there could be a record 'jack' is related to 'john', which expresses the same relationship. How can I identify and delete these kinds of 'duplicate' records?


Solution

  • To delete such records, you could use exists logic which checks for the presence of the same pair in the reverse order. Assuming arbitrarily that you want to retain the pair with the lexicographically lowest name first, consider:

    DELETE
    FROM yourTable
    WHERE EXISTS (SELECT 1 FROM yourTable t2
                  WHERE t2.name2 = yourTable.name1 AND
                        t2.name1 = yourTable.name2) AND
          name1 >= name2;
    

    This assumes that your table has 2 columns name1 and name2 which store the relationships. For example, the following data:

    name1 | name2
    Jack  | Henry
    Henry | Jack
    Paul  | Revere
    

    would become:

    name1 | name2
    Henry | Jack
    Paul  | Revere