Search code examples
sqlpostgresqlrecursive-query

Recursive SQL query with links in PostgreSQL (migration from Oracle)


I have a table called NodeHierarchy

id name parentId linkedTargetId
1 A null null
2 B 1 null
3 C 2 null
4 D 2 null
5 E 3 null
6 F 4 null
7 G 3 null
8 H 6 3
9 I 6 null

visual representation

in Oracle i can use this query to get the information i need including following the links

select distinct pe.*
from (SELECT p.id
      FROM NodeHierarchy p
               JOIN (SELECT pi.PARENTID, pi.ID
                     FROM NodeHierarchy pi
                     UNION ALL
                     SELECT d.PARENTID, s.ID
                     FROM NodeHierarchy s
                              JOIN NodeHierarchy d ON d.ID = s.PARENTID) children ON children.ID = p.ID
      START WITH children.PARENTID = 6
      CONNECT BY NOCYCLE PRIOR p.linkedTargetId = children.PARENTID) tree,
     NodeHierarchy pe
where pe.PARENTID = tree.ID
   or pe.id = tree.id

so far i was not able to create a query in PostgreSQL that gives me the same result for each starting id, excluding the follow link part

WITH RECURSIVE tree AS (
    SELECT p.id, p.parentId, p.linkedTargetId
    FROM NodeHierarchy p
    WHERE p.id = 6

    UNION ALL

    SELECT child.id, child.parentId, child.linkedTargetId
    FROM NodeHierarchy child
             INNER JOIN tree t ON t.id = child.parentId
) cycle id set is_cycle using path
SELECT DISTINCT pe.*
FROM tree
         JOIN NodeHierarchy pe
              ON pe.parentId = tree.id ;

What i'm missing is the part of following the linkedTargetId and getting their children, can anybody pls help me?

The example does not contain a circular link chain, but in my real data it is also possible

Full SQL-Script for the example:

CREATE TABLE NodeHierarchy
(
    id             INT PRIMARY KEY,
    name           VARCHAR(50),
    parentId       INT,
    linkedTargetId INT,
    FOREIGN KEY (parentId) REFERENCES NodeHierarchy (id),
    FOREIGN KEY (linkedTargetId) REFERENCES NodeHierarchy (id)
);

INSERT INTO NodeHierarchy (id, name, parentId, linkedTargetId)
VALUES (1, 'A', NULL, NULL);
INSERT INTO NodeHierarchy (id, name, parentId, linkedTargetId)
VALUES (2, 'B', 1, NULL);
INSERT INTO NodeHierarchy (id, name, parentId, linkedTargetId)
VALUES (3, 'C', 2, NULL);
INSERT INTO NodeHierarchy (id, name, parentId, linkedTargetId)
VALUES (4, 'D', 2, NULL);
INSERT INTO NodeHierarchy (id, name, parentId, linkedTargetId)
VALUES (5, 'E', 3, NULL);
INSERT INTO NodeHierarchy (id, name, parentId, linkedTargetId)
VALUES (6, 'F', 4, NULL);
INSERT INTO NodeHierarchy (id, name, parentId, linkedTargetId)
VALUES (7, 'G', 3, NULL);
INSERT INTO NodeHierarchy (id, name, parentId, linkedTargetId)
VALUES (8, 'H', 6, 3);
INSERT INTO NodeHierarchy (id, name, parentId, linkedTargetId)
VALUES (9, 'I', 6, NULL);

Solution

  • If I understand the question correctly you want to link the current node with a node if:

    1. the parentId in that node matches the Id of the current node
    2. the Id of that node matches the linkedTargetId of the current node

    If that's the case then I believe this should work:

    WITH RECURSIVE tree AS (
        SELECT p.id, p.parentId, p.linkedTargetId
        FROM NodeHierarchy p
        WHERE p.id = 6
        
        UNION ALL
        
        SELECT child.id, child.parentId, child.linkedTargetId
        FROM NodeHierarchy child
                 INNER JOIN tree t ON t.id = child.parentId or t.linkedTargetId = child.id
    ) cycle id set is_cycle using path
    SELECT pe.*
    FROM tree
    JOIN NodeHierarchy pe
      ON pe.id = tree.id
    order by array_length(path,1);
    

    The recursive relation will look to join on either the parentId of the child node or the linkedTargetId of the current node.

    Hope this helps.