Search code examples
sql-servert-sqlgraphsql-server-2008-r2sql-graph

SQL Server 2008R2 from Table to edges for Graph


I have stumbled upon an interesting challenge. I have data in a SQL Server table with the following format/content.

Date     | Name
---------+---------
1/1/2010 | John
1/1/2010 | Mark
1/1/2010 | Peter
1/1/2010 | Mia
2/4/2010 | John
2/4/2010 | Billy

I am trying to convert that table into a file containing edges of a graph.

I'll need the edges file to have to columns and all the combinations that the table shows.

John | Mark
John | Peter
John | Mia
Mark | Mia
Mark | Peter
Peter | Mia
John | Billy

I suspect part of this can be achieved with pivot/unpivot but don't know how to proceed with limiting the pivot to only two columns.

Also, I don't know how to make sure I get all the possible combinations of nodes, see that the first four 'nodes' need to become six 'edges.'


Solution

  • You could use ROW_NUMBER and "triangle join":

    WITH cte AS
    (
      SELECT *, rn = ROW_NUMBER() OVER(PARTITION BY date ORDER BY Name)
      FROM tab
    )
    SELECT c.Name, c2.Name
    FROM cte c
    JOIN cte c2
      ON c.Date = c2.Date
      AND c.rn < c2.rn;
    

    LiveDemo

    Output:

    ╔═══════╦═══════╗
    ║ Name  ║ Name  ║
    ╠═══════╬═══════╣
    ║ John  ║ Mark  ║
    ║ John  ║ Mia   ║
    ║ John  ║ Peter ║
    ║ Mark  ║ Mia   ║
    ║ Mark  ║ Peter ║
    ║ Mia   ║ Peter ║      -- ORDER BY based on `Name`
    ║ Billy ║ John  ║      -- same here `B` before `J`
    ╚═══════╩═══════╝
    

    Note:

    To get stable sort you need to add column that will indicate order within group with the same date. I used Name but it swaps the names in last two rows.


    Version with ID column:

    CREATE TABLE tab(ID INT IDENTITY(1,1)
                     ,Date DATE  NOT NULL 
                     ,Name VARCHAR(6) NOT NULL);
    
    INSERT INTO tab(Date,Name) 
    VALUES ('1/1/2010','John'), ('1/1/2010','Mark'), ('1/1/2010','Peter')
          ,('1/1/2010','Mia'), ('2/4/2010','John'),('2/4/2010','Billy');
    
    WITH cte AS
    (
      SELECT *, rn = ROW_NUMBER() OVER(PARTITION BY date ORDER BY ID)
      FROM tab
    )
    SELECT c.Name, c2.Name
    FROM cte c
    JOIN cte c2
      ON c.Date = c2.Date
      AND c.rn < c2.rn;
    

    LiveDemo 2

    Output:

    ╔═══════╦═══════╗
    ║ Name  ║ Name  ║
    ╠═══════╬═══════╣
    ║ John  ║ Mark  ║
    ║ John  ║ Peter ║
    ║ John  ║ Mia   ║
    ║ Mark  ║ Peter ║
    ║ Mark  ║ Mia   ║
    ║ Peter ║ Mia   ║
    ║ John  ║ Billy ║
    ╚═══════╩═══════╝