Check SQL Server table values against themselves

Imagine I had this table:

declare @tmpResults table ( intItemId int, strTitle nvarchar(100), intWeight float )

insert into @tmpResults values (1, 'Item One', 7)
insert into @tmpResults values (2, 'Item One v1', 6)
insert into @tmpResults values (3, 'Item Two', 6)
insert into @tmpResults values (4, 'Item Two v1', 7)

And a function, which we'll call fn_Lev that takes two strings, compares them to one another and returns the number of differences between them as an integer (i.e. the Levenshtein distance).

What's the most efficient way to query that table, check the fn_Lev value of each strTitle against all the other strTitles in the table and delete rows are similar to one another by a Levenshtein distance of 3, preferring to keeping higher intWeights?

So the after the delete, @tmpResults should contain

1   Item One    7
4   Item Two v1 7

I can think of ways to do this, but nothing that isn't horribly slow (i.e iterative). I'm sure there's a faster way?

Cheers, Matt

Solution

SELECT strvalue= CASE 
                WHEN t1.intweight >= t2.intweight THEN t1.strtitle 
                ELSE t2.strtitle 
              END, 
       dist = Fn_lev(t1.strtitle, t2.strtitle) 
FROM   @tmpResults AS t1 
       INNER JOIN @tmpResults AS t2 
         ON t1.intitemid < t2.intitemid 
WHERE  Fn_lev(t1.strtitle, t2.strtitle) = 3

This will perform a self join that will match each row only once. It will excluding matching a row on itself or reverse of a previous match ie if A<->B is a match then B<->A isn't.

The case statement selects the highest weighted result