Search code examples
sqloracle-databasejoin

How can I join two tables with condition in the joining variables Oracle Database


I am working in an Oracle Database. Suppose I have the next table (table 1):

contract key1 key2
contract1 abcdefghijklmn NULL
contract2 NULL uhbgnd

And the following table (table 2):

key key_value
abcdefghijklmn 30
1234uhbgnd 4

I want to join both of them by key of table2. As you can see, in table 1 I have the keys separated in two columns. In this case, what I want is to JOIN first by t2.key = t1.key1 and, if Key 1 is NULL, then the JOIN I would like is SUBSTR(t2.key, 5) = t1.key2.

In my mind, the JOIN would look something like this:

SELECT
    t1.contract,
    t2.key,
    t2.key_value
FROM
    table1 t1
    INNER JOIN table2 t2
    ON CASE 
        WHEN t1.key1 IS NOT NULL THEN t2.key = t1.key1
        ELSE SUBSTR(t2.key, 5) = t1.key2

Thus, the desirable output would be:

contract key value
contract1 abcdefghijklmn 30
contract2 1234uhbgnd 4

However, I cannot make this work. Can you help me?


Solution

  • Any answer involving a CASE/DECODE or set of OR conditions in a join is going to run into performance problems at large data volumes (unless Oracle rewrites the OR as a set of union, called "or-expansion". It cannot ever, however, rewrite a CASE statement).

    You are far better off joining twice so each join can be a simple equijoin:

    SELECT
        t1.contract,
        COALESCE(t2a.key,t2b.key) key,
        COALESCE(t2a.key_value,t2b.key_value) key_value
    FROM
        table1 t1
        LEFT OUTER JOIN table2 t2a ON t2a.key = t1.key1
        LEFT OUTER JOIN table2 t2b ON SUBSTR(t2b.key, 5) = t1.key2
    

    This assumes that SUBSTR(t2.key,5) is unique. If it is not, you would want to disable one of these joins to prevent them both from succeeding (as two one-to-many joins will give you a cartesian product). You can use conditional logic to do this, but keep it on the inner table columns only:

    FROM
        table1 t1
        LEFT OUTER JOIN table2 t2a ON t2a.key = DECODE(t1.key2,NULL,t1.key1)
        LEFT OUTER JOIN table2 t2b ON SUBSTR(t2b.key, 5) = DECODE(t1.key1,NULL,t1.key2)
    

    or if both can be non-null, pick one to favor:

    FROM
        table1 t1
        LEFT OUTER JOIN table2 t2a ON t2a.key = t1.key1
        LEFT OUTER JOIN table2 t2b ON SUBSTR(t2b.key, 5) = DECODE(t1.key1,NULL,t1.key2)
    

    These are still straight equijoins. The important thing is that while conditional logic on either one side or the other is acceptable, making the join operator itself (=) conditional or the whole clause conditional or referring to both tables within the same conditional clause or CASE statement creates significant problems for the optimizer. This is of course an issue only for joins; it is not a problem within the SELECT clause.