I have the following table:
create table dbo.Link
(
FromNodeId int not null,
ToNodeId int not null
)
Rows in this table represent links between nodes.
I want to prevent inserts or updates to this table from creating a cyclic relationship between nodes.
So if the table contains:
(1,2)
(2,3)
it should not be allowed to contain any of the following:
(1,1)
(2,1)
(3,1)
I'm happy to treat (1,1) separately (e.g. using a CHECK CONSTRAINT) if it makes the solution more straightforward.
I was thinking of creating an AFTER INSERT trigger with a recursive CTE (though there may be an easier way to do it).
Assuming this is the way to go, what would the trigger definition be? If there is a more elegant way, what is it?
Note first that it is preferable to detect cycles in another environment as recursive CTEs aren't known for their good performance and neither is a trigger that would run for each insert statement. For large graphs, a solution based on the solution below will likely be inefficient.
Suppose you create the table as follows:
CREATE TABLE dbo.lnk (
node_from INT NOT NULL,
node_to INT NOT NULL,
CONSTRAINT CHK_self_link CHECK (node_from<>node_to),
CONSTRAINT PK_lnk_node_from_node_to PRIMARY KEY(node_from,node_to)
);
That would block inserts with node_from
equal to node_to
, and for rows that already exist.
The following trigger should detect cyclic references by throwing an exception if a cyclic reference is detected:
CREATE TRIGGER TRG_no_circulars_on_lnk ON dbo.lnk AFTER INSERT
AS
BEGIN
DECLARE @cd INT;
WITH det_path AS (
SELECT
anchor=i.node_from,
node_to=l.node_to,
is_cycle=CASE WHEN i.node_from/*anchor*/=l.node_to THEN 1 ELSE 0 END
FROM
inserted AS i
INNER JOIN dbo.lnk AS l ON
l.node_from=i.node_to
UNION ALL
SELECT
dp.anchor,
node_to=l.node_to,
is_cycle=CASE WHEN dp.anchor=l.node_to THEN 1 ELSE 0 END
FROM
det_path AS dp
INNER JOIN dbo.lnk AS l ON
l.node_from=dp.node_to
WHERE
dp.is_cycle=0
)
SELECT TOP 1
@cd=is_cycle
FROM
det_path
WHERE
is_cycle=1
OPTION
(MAXRECURSION 0);
IF @cd IS NOT NULL
THROW 67890, 'Insert would cause cyclic reference', 1;
END
I tested this for a limited number of inserts.
INSERT INTO dbo.lnk(node_from,node_to)VALUES(1,2); -- OK
INSERT INTO dbo.lnk(node_from,node_to)VALUES(2,3); -- OK
INSERT INTO dbo.lnk(node_from,node_to)VALUES(3,4); -- OK
And
INSERT INTO dbo.lnk(node_from,node_to)VALUES(2,3); -- PK violation
INSERT INTO dbo.lnk(node_from,node_to)VALUES(1,1); -- Check constraint violation
INSERT INTO dbo.lnk(node_from,node_to)VALUES(3,2); -- Exception: Insert would cause cyclic reference
INSERT INTO dbo.lnk(node_from,node_to)VALUES(3,1); -- Exception: Insert would cause cyclic reference
INSERT INTO dbo.lnk(node_from,node_to)VALUES(4,1); -- Exception: Insert would cause cyclic reference
It also detects cyclic references already present in the inserted rows if inserting more than one row at once, or if a path longer than one edge would be introduced in the graph. Going off on the same initial inserts:
INSERT INTO dbo.lnk(node_from,node_to)VALUES(8,9),(9,8); -- Exception: Insert would cause cyclic reference
INSERT INTO dbo.lnk(node_from,node_to)VALUES(4,5),(5,6),(6,1); -- Exception: Insert would cause cyclic reference