So, I basically have two tables.
| id | marketGroupID |
|--------|---------------|
| 1 | 134 |
| 2 | 868 |
| 3 | 387 |
second table:
| marketGroupid | parentGroupID |
|---------------|---------------|
| 134 | 5987 |
| 5987 | NULL |
| 868 | 8796 |
| 8796 | 387 |
| 387 | NULL |
The end result should look like this:
| id | marketGroupID | parentGroupID |
|--------|---------------|---------------|
| 1 | 134 | 5987 |
| 2 | 868 | 387 |
| 3 | 387 | 387 |
Now I know how to INNER JOIN the both tables via
SELECT table1.id,table1.marketGroupID,table2.parentGroupID FROM 'table1' INNER JOIN 'table2' ON table1.marketGroupID=table2.marketGroupID;
Now the problem here is that this would only show the immediate parent. But I want the root parent. I don't know how many nodes will be in between the root parent and the child. All I know is, that the root parent is reached once parentGroupID is NULL. The root parent might already be the marketGroupID as is the case with id=3 or there might be 3 nodes in between as with id=2.
The point here is that I do not want the intermediate steps. The first answer in the "Duplicate" does exactly that. Furthermore does it assume a maximum number of intermediate steps. For each step there is another left join. I also don't know on which level the root parent will end up, since I don't know how many intermediate steps I have. So I would have to traverse each row from left to right until I meet null and that the last value before that.
The answers that use cte's assume, just one table. But I have too tables. And I still only want the root parent, nothing in between.
Technically I don't even need the original market group id. For each id in table one I just need the root parentgroupid and that is it.
This is a classic case for using a recursive CTE.
Start with one that generates a set of rows for each id in the first table that iterates through all the parent chains:
WITH RECURSIVE parents(id, child, parent) AS
(SELECT id, NULL, marketGroupID FROM table1
UNION ALL
SELECT p.id, p.parent, t2.parentGroupID
FROM table2 AS t2
JOIN parents AS p ON p.parent = t2.marketGroupID)
SELECT * FROM parents;
id child parent
---------- ---------- ----------
1 134
2 868
3 387
1 134 5987
2 868 8796
3 387
1 5987
2 8796 387
2 387
In these results, the rows with a null child column are the base group ids, the rows with values in both columns are intermediates in the chain of links to the root, and the rows with a null parent column have the final root groupid for each row of table1
. It's this last set of rows we're interested in. Adding in tracking the original child groups and filtering out the non-root rows gives the final results:
WITH RECURSIVE parents(id, market, child, parent) AS
(SELECT id, marketGroupID, NULL, marketGroupID FROM table1
UNION ALL
SELECT p.id, p.market, p.parent, t2.parentGroupID
FROM table2 AS t2
JOIN parents AS p ON p.parent = t2.marketGroupID)
SELECT id, market AS marketGroupID, child AS parentGroupID
FROM parents
WHERE parent IS NULL
ORDER BY id;
id marketGroupId parentGroupID
---------- ------------- -------------
1 134 5987
2 868 387
3 387 387
Technically I don't even need the original market group id. For each id in table one I just need the root parentgroupid and that is it.
WITH RECURSIVE parents(id, child, parent) AS
(SELECT id, NULL, marketGroupID FROM table1
UNION ALL
SELECT p.id, p.parent, t2.parentGroupID
FROM table2 AS t2
JOIN parents AS p ON p.parent = t2.marketGroupID)
SELECT id, child AS parentGroupID FROM parents WHERE parent IS NULL ORDER BY ID;