Search code examples
sql-serversql-graph

How do you create a recursive Edge in SQL Graph tables?


Recently a customer asked how to create a Node with an Edge recursively pointing back to the same Node. The use case was around the concept of a Product "recommending" another Product. Here's a conceptual diagram.

enter image description here


Solution

  • In SQL Server's SQL Graph, any table can be attributed with one of the DDL extensions AS NODE or AS EDGE. When an Edge is created it is not directed or constrained, but with the new CONNECTION keyword, Edges can be constrained to-and-from only specified Nodes. Let's start with the Products table.

    CREATE TABLE Products
    (
      Id INT PRIMARY KEY
      , Name VARCHAR(50) NOT NULL
    ) AS NODE;
    

    This creates an empty table ready to be filled from your RDBMS database for querying with graph queries. You might, for example, want to ask "Does the recommendation chain of THIS product recursively EVER recommend THAT product?" That's a difficult question to ask with a standard TSQL query in any database. It's relatively simple in a graph database.

    Aside: This is where the idea of SQL Graph is interesting. If you have a single question well suited for a graph database, why migrate your data to a dedicated graph database and lose out on the capabilities of SQL Server around performance, scalability, high availability, interoperability, reporting, and support? SQL Graph lets you build a little graph right on top of an existing RDBMS structure without any of those potential compromises.

    Here's the magic.

    CREATE TABLE Recommends
    (
      CONSTRAINT EC_RECOMMENDS 
        CONNECTION (Product TO Product) 
          ON DELETE CASCADE
    ) AS EDGE
    

    This DDL statement lets you create the Edge you want for recommendation and adds a CONNECTION constraint to ensure the Edge can only be from a Product to a Product and no other Node can participate. Note: you could add ANOTHER constraint if you wanted to reuse this Edge with other Nodes.

    Now you can answer that question "Does the recommendation chain of THIS product recursively EVER recommend THAT product?" with query something like this:

    SELECT
      STRING_AGG(Product.name, '->') WITHIN GROUP (GRAPH PATH) AS RecommendationPath
    FROM
      Product AS Product,
      Recommends FOR PATH AS recommends,
      Product FOR PATH  AS Recommendation
    WHERE 
      MATCH(SHORTEST_PATH(Product(-(recommends)->Recommendation)+))
      AND Product.Id = 123 
      AND Recommendation.Id = 234
    

    There are several graph-specific functions built-in to TSQL today. For those missing you have a few interesting options: 1) write your own in TSQL. I have done this on several projects and find it uncommonly straight forward, depending on the algorithm, or 2) consider filtering a subset of the data suited for the algorithm you need and use SQL Server's ML Services capability to expose that data to whatever library your data scientists enjoy most. Having said that, the need to do #2, exporting your data, will be limited to SQL Managed Instance (in Azure) and is super-duper uncommon.