Search code examples
sql-servert-sqlsql-server-2017sql-graphsql-server-2017-graph

How to prevent insertion of cyclic reference in SQL


I have the following table:

create table dbo.Link
(
    FromNodeId int not null,
    ToNodeId int not null
)

Rows in this table represent links between nodes.

I want to prevent inserts or updates to this table from creating a cyclic relationship between nodes.

So if the table contains:

(1,2)
(2,3)

it should not be allowed to contain any of the following:

(1,1)
(2,1)
(3,1)

I'm happy to treat (1,1) separately (e.g. using a CHECK CONSTRAINT) if it makes the solution more straightforward.

I was thinking of creating an AFTER INSERT trigger with a recursive CTE (though there may be an easier way to do it).

Assuming this is the way to go, what would the trigger definition be? If there is a more elegant way, what is it?


Solution

  • Note first that it is preferable to detect cycles in another environment as recursive CTEs aren't known for their good performance and neither is a trigger that would run for each insert statement. For large graphs, a solution based on the solution below will likely be inefficient.


    Suppose you create the table as follows:

    CREATE TABLE dbo.lnk (
        node_from INT NOT NULL,
        node_to INT NOT NULL,
        CONSTRAINT CHK_self_link CHECK (node_from<>node_to),
        CONSTRAINT PK_lnk_node_from_node_to PRIMARY KEY(node_from,node_to)
    );
    

    That would block inserts with node_from equal to node_to, and for rows that already exist.

    The following trigger should detect cyclic references by throwing an exception if a cyclic reference is detected:

    CREATE TRIGGER TRG_no_circulars_on_lnk ON dbo.lnk AFTER INSERT
    AS
    BEGIN
        DECLARE @cd INT;
        WITH det_path AS (
            SELECT
                anchor=i.node_from,
                node_to=l.node_to,
                is_cycle=CASE WHEN i.node_from/*anchor*/=l.node_to THEN 1 ELSE 0 END
            FROM
                inserted AS i
                INNER JOIN dbo.lnk AS l ON
                    l.node_from=i.node_to
            UNION ALL
            SELECT
                dp.anchor,
                node_to=l.node_to,
                is_cycle=CASE WHEN dp.anchor=l.node_to THEN 1 ELSE 0 END
            FROM
                det_path AS dp
                INNER JOIN dbo.lnk AS l ON
                    l.node_from=dp.node_to
            WHERE
                dp.is_cycle=0
        )
        SELECT TOP 1
            @cd=is_cycle 
        FROM 
            det_path
        WHERE
            is_cycle=1
        OPTION 
            (MAXRECURSION 0);
    
        IF @cd IS NOT NULL 
            THROW 67890, 'Insert would cause cyclic reference', 1;
    END
    

    I tested this for a limited number of inserts.

    INSERT INTO dbo.lnk(node_from,node_to)VALUES(1,2); -- OK
    INSERT INTO dbo.lnk(node_from,node_to)VALUES(2,3); -- OK
    INSERT INTO dbo.lnk(node_from,node_to)VALUES(3,4); -- OK
    

    And

    INSERT INTO dbo.lnk(node_from,node_to)VALUES(2,3); -- PK violation
    INSERT INTO dbo.lnk(node_from,node_to)VALUES(1,1); -- Check constraint violation
    INSERT INTO dbo.lnk(node_from,node_to)VALUES(3,2); -- Exception: Insert would cause cyclic reference
    INSERT INTO dbo.lnk(node_from,node_to)VALUES(3,1); -- Exception: Insert would cause cyclic reference
    INSERT INTO dbo.lnk(node_from,node_to)VALUES(4,1); -- Exception: Insert would cause cyclic reference
    

    It also detects cyclic references already present in the inserted rows if inserting more than one row at once, or if a path longer than one edge would be introduced in the graph. Going off on the same initial inserts:

    INSERT INTO dbo.lnk(node_from,node_to)VALUES(8,9),(9,8);       -- Exception: Insert would cause cyclic reference
    INSERT INTO dbo.lnk(node_from,node_to)VALUES(4,5),(5,6),(6,1); -- Exception: Insert would cause cyclic reference