We're doing a LOT of work towards trying to reconcile about 1,000 duplicate manufacturer names and 1,000,000 duplicate part numbers. One thing that has come up is how to "match" things like "Limited" vs. "Ltd." vs. "Ltd"
The purpose is for the application to reconcile these matched items into a standard format. So:
ACME Ltd. ACME Limited ACME Ltd
Should all be reconciled into ACME Ltd.
This will also be used to prevent entering additional duplicates in the future.
Any suggestions on how to accomplish this pattern matching in SQL Server? Any known algorithms to find items with mapped equivalencies, etc...?
Thanks!
Eric.
How about a table that lists what you want in one column and variations in the next?
Ltd Limited
Ltd Ltd.
St Street
St Str.
Then, if you find a match on the second column, you change it to the first. It may take several iterations, as you find other alternatives.