Search code examples
sqloracle-databasesubtraction

How to substract rows from one table from another only once


I'm working for a university project, and I have the following question: I have 2 tables in a Oracle DB... I need to select those rows from table1, which are not included in table2... But the main problem is that I need to exclude that rows from table2 wich was selected once... For example:

Table1                       Table2                      ResultTable
id  | Number | Letter        id  | Number | Letter       id  | Number | Letter
_____________________        _____________________       _____________________
1        4        S          1       6        G          2        2        P 
2        2        P          2       8        B          3        5        B
3        5        B          3       4        S          4        4        S
4        4        S          4       1        A          6        2        P
5        1        A          5       1        H
6        2        P          6       2        X

So, how you see it, if one row from Table1 has a "twin" in Table2, they both are excluded.


Solution

  • Probably the most thorough query is this:

    SELECT table1.id,
           table1.digit,
           table1.letter
      FROM ( SELECT id,
                    digit,
                    letter,
                    ROW_NUMBER() OVER (PARTITION BY digit, letter ORDER BY id) rn
               FROM table1
           ) table1
      LEFT
      JOIN ( SELECT id,
                    digit,
                    letter,
                    ROW_NUMBER() OVER (PARTITION BY digit, letter ORDER BY id) rn
               FROM table2
           ) table2
        ON table2.digit = table1.digit
       AND table2.letter = table1.letter
       AND table2.rn = table1.rn
     WHERE table2.id IS NULL
     ORDER
        BY table1.id
    ;
    

    which gives each record in table1 and table2 a "row number" within its group of "twins". For example, this:

    SELECT id,
           digit,
           letter,
           ROW_NUMBER() OVER (PARTITION BY digit, letter ORDER BY id) rn
      FROM table1
     ORDER
        BY table1.id
    ;
    

    returns this:

            ID      DIGIT LETT         RN
    ---------- ---------- ---- ----------
             1          4 S             1
             2          2 P             1
             3          5 B             1
             4          4 S             2     -- second row with 4 S
             5          1 A             1
             6          2 P             2     -- second row with 2 P
    

    That said, if you know that no (digit, letter) can ever appear more than once in table2, you can simplify this considerably by using EXISTS instead of ROW_NUMBER():

    SELECT id,
           digit,
           letter
      FROM table1 table1a
     WHERE EXISTS
            ( SELECT 1
                FROM table1
               WHERE digit = table1a.digit
                 AND letter = table1a.letter
                 AND id < table1a.id
            )
        OR NOT EXISTS
            ( SELECT 1
                FROM table2
               WHERE digit = table1a.digit
                 AND letter = table1a.letter
            )
    ;