Search code examples
combinationsabapfactorialopensql

Compare duplicates for 4 fields in open SQL


I want to compare if there are duplicates across 4 fields in open SQL.

Scenario: User has 4 fields to input. First name (N1), last name (N2), additional first name (N3) and additional last name (N4).

Right now the algorithm works this way: It concatenates N1 + N2 + % and then also N2+ N1 + %. So if the user inputs in any of the fields, the query looks for N1N2% or N2N1%. This mean for 2 fields, there are 2! combinations possible. Now with 2 additional fields, this algorithm explodes as there will be 4! combinations to check. Any ideas how to tackle this?

Note: We do this kind of combination check because the user could input data in any of those given input field. So we check for all combination of fields. Unfortunately, this cannot be changed.

EDIT: I cannot assume the order as it was previously designed in such a way. Hence, the complications with combinations.

Edit2: I like the idea of checking individual parts. But what we want to do is ideally concatenate all strings together and check for a substring in DB. In open-sql its done using the like statement. Our DB table has such concatenated string already stored for N1+N2 combination. This needs to be extended for 4 fields now.


Solution

  • The key to your problem is checking all name parts individually with leading and trailing '%' and check the total size of the db entry against the sum of the name parts:

    field = ('%' + N1 + '%') AND field = ('%' + N2 + '%') AND field = ('%' + N3 + '%') AND field = ('%' + N4 + '%') AND LENGTH(field) = LENGTH(N1+N2+N3+N4)

    This will find a match. You could use it to SELECT a normalized concatenation of the names and use GROUP BY and HAVING count(*)>1 to search for duplicates.