Search code examples
sql-servergraphcluster-analysiscoefficients

Clustering Coefficient using SQL Server/C#


I have two tables in SQL Server i.e.
one table is GraphNodes as:

---------------------------------------------------------
id | Node_ID | Node            |  Node_Label | Node_Type
---------------------------------------------------------
1    677       Nuno Vasconcelos   Author       1
2    1359      Peng Shi           Author       1
3    6242      Z. Q. Shi          Author       1
4    8318      Kiyoung Choi       Author       1
5    12405     Johan A. K.        Author       1
6    26615     Tzung-Pei Hong     Author       1
7    30559     Luca Benini        Author       1  
...  
...  

and other table is GraphEdges as:

-----------------------------------------------------------------------------------------
id | Source_Node | Source_Node_Type | Target_Node | Target_Node_Type | Year |  Edge_Type
-----------------------------------------------------------------------------------------
1    1             1                  10965         2                  2005    1
2    1             1                  10179         2                  2007    1
3    1             1                  10965         2                  2007    1
4    1             1                  19741         2                  2007    1
5    1             1                  10965         2                  2009    1
6    1             1                  4816          2                  2011    1
7    1             1                  5155          2                  2011    1  
...  
...

I also have two tables i.e. GraphNodeTypes as:

-------------------------
id | Node    | Node_Type
-------------------------
1    Author    1
2    CoAuthor  2
3    Venue     3
4    Paper     4  

and GraphEdgeTypes as:

-------------------------------
id | Edge          | Edge_Type
-------------------------------
1    AuthorCoAuthor  1
2    CoAuthorVenue   2
3    AuthorVenue     3
4    PaperVenue      4
5    AuthorPaper     5
6    CoAuthorPaper   6  

Now, I want to calculate clustering coefficient for this graph i.e of two types:
If N(V) is # of links b/w neighbors of node V and K(V) is degree of node V then,

Local Clustering Coefficient(V) = 2 * N(V)/K(V) [K(V) - 1]  

and

Global Clustering Coefficient = 3 * # of Triangles / # of connected Triplets of V  

The questions is, how can I calculate degree of a node? Is it possible in SQL Server or C# programming required. And also please suggest hints for calculating Local and Global CCs as well.

Thanks!


Solution

  • The degree of a node is not "calculated". It's simply the number of edges this node has.

    While you can try to do this in SQL, the performance will likely be mediocre. Such type of analysis is commonly done in specialized databases and, if possible, in memory.