Search code examples
databasesqlitetreeinner-join

inner join root of tree to table


So, I basically have two tables.

|   id   | marketGroupID |
|--------|---------------|
|   1    |      134      |
|   2    |      868      |
|   3    |      387      |

second table:

| marketGroupid | parentGroupID |
|---------------|---------------|
|   134         |      5987     |
|   5987        |      NULL     |
|   868         |      8796     |
|   8796        |      387      |
|   387         |      NULL     |

The end result should look like this:

|   id   | marketGroupID | parentGroupID |
|--------|---------------|---------------|
|   1    |      134      |     5987      |
|   2    |      868      |     387       |
|   3    |      387      |     387       |

Now I know how to INNER JOIN the both tables via

SELECT table1.id,table1.marketGroupID,table2.parentGroupID FROM 'table1' INNER JOIN 'table2' ON table1.marketGroupID=table2.marketGroupID;

Now the problem here is that this would only show the immediate parent. But I want the root parent. I don't know how many nodes will be in between the root parent and the child. All I know is, that the root parent is reached once parentGroupID is NULL. The root parent might already be the marketGroupID as is the case with id=3 or there might be 3 nodes in between as with id=2.

The point here is that I do not want the intermediate steps. The first answer in the "Duplicate" does exactly that. Furthermore does it assume a maximum number of intermediate steps. For each step there is another left join. I also don't know on which level the root parent will end up, since I don't know how many intermediate steps I have. So I would have to traverse each row from left to right until I meet null and that the last value before that.

The answers that use cte's assume, just one table. But I have too tables. And I still only want the root parent, nothing in between.

Technically I don't even need the original market group id. For each id in table one I just need the root parentgroupid and that is it.


Solution

  • This is a classic case for using a recursive CTE.

    Start with one that generates a set of rows for each id in the first table that iterates through all the parent chains:

    WITH RECURSIVE parents(id, child, parent) AS
      (SELECT id, NULL, marketGroupID FROM table1
      UNION ALL
       SELECT p.id, p.parent, t2.parentGroupID
       FROM table2 AS t2
       JOIN parents AS p ON p.parent = t2.marketGroupID)
    SELECT * FROM parents;
    id          child       parent
    ----------  ----------  ----------
    1                       134
    2                       868
    3                       387
    1           134         5987
    2           868         8796
    3           387
    1           5987
    2           8796        387
    2           387
    

    In these results, the rows with a null child column are the base group ids, the rows with values in both columns are intermediates in the chain of links to the root, and the rows with a null parent column have the final root groupid for each row of table1. It's this last set of rows we're interested in. Adding in tracking the original child groups and filtering out the non-root rows gives the final results:

    WITH RECURSIVE parents(id, market, child, parent) AS
      (SELECT id, marketGroupID, NULL, marketGroupID FROM table1
      UNION ALL
       SELECT p.id, p.market, p.parent, t2.parentGroupID
       FROM table2 AS t2
       JOIN parents AS p ON p.parent = t2.marketGroupID)
    SELECT id, market AS marketGroupID, child AS parentGroupID
    FROM parents
    WHERE parent IS NULL
    ORDER BY id;
    id          marketGroupId  parentGroupID
    ----------  -------------  -------------
    1           134            5987
    2           868            387
    3           387            387
    

    Technically I don't even need the original market group id. For each id in table one I just need the root parentgroupid and that is it.

    WITH RECURSIVE parents(id, child, parent) AS
      (SELECT id, NULL, marketGroupID FROM table1
      UNION ALL
       SELECT p.id, p.parent, t2.parentGroupID
       FROM table2 AS t2
       JOIN parents AS p ON p.parent = t2.marketGroupID)
    SELECT id, child AS parentGroupID FROM parents WHERE parent IS NULL ORDER BY ID;