Search code examples
sqlsql-servermany-to-manysparse-matrix

How to match variable data in SQL Server


I need to map a many-to-many relationship between two flat tables. Table A contains a list of possible configurations (where each column is a question and the cell value is the answer). NULL values denote that the answer does not matter. Table B contains actual configurations with the same columns.

Ultimately, I need the final results to show which configurations are mapped between table B and A:

Example 

ActualId | ConfigId
---------+---------
    5    |  1
    6    |  1
    8    |  2
    .    |  .
    .    |  .
    .    |  .
    N    |  M 

To give a simple example of the tables and data I'm working with, the first table would look like such:

Table A
--------
ConfigId | Size | Color | Cylinders | ... | ColumnN
---------+------+-------+-----------+-----+--------
    1    |  3   |       |    4      | ... |   5   
    2    |  4   |   5   |    5      | ... |   5
    3    |      |   5   |           | ... |   5

And Table B would look like this:

Table B
-------
ActualId | Size | Color | Cylinders | ... | ColumnN
---------+------+-------+-----------+-----+--------
    1    |  3   |  1    |    4      | ... |    5
    2    |  3   |  8    |    4      | ... |    5
    3    |  4   |  5    |    5      | ... |    5
    4    |  7   |  5    |    6      | ... |    5

Since the NULL values denote that any value can work, the expected result would be:

Expected      
---------
ActualId | ConfigId
---------+---------
   1     |    1
   2     |    1
   3     |    2
   3     |    3
   4     |    3

I'm trying to figure out the best way to go about matching the actual data which has over a hundred columns. I know trying to check each and every column for NULL values is absolutely wrong and will not perform well. I'm really fascinated with this problem and would love some help to find the best way to tackle this.


Solution

  • So, this joins table a on size, color and cylinders.

    The size match will be A against B: If A.SIZE is null, the compare will B.SIZE=B.SIZE which will always return true. If A.SIZE is not null, the compare will be A.SIZE=B.SIZE which will only be true if they match.

    The matching on color and cylinders are similar.

    SELECT * FROM TABLEA A
        INNER JOIN TABLEB B ON ISNULL(A.SIZE, B.SIZE)=B.SIZE
            AND ISNULL(A.COLOR, B.COLOR)=B.COLOR
            AND ISNULL(A.CYLINDERS, B.CYLINDERS)=B.CYLINDERS