Search code examples
sqlsql-serverrdbms

Creating a VIEW to get a connection count


I have a table below which stores Connections between 2 person

TABLE (CONNECTION)

ID | REQUEST_PERSON | REQUESTEE_PERSON

I would like to build a VIEW which gets the REQUEST_PERSON, REQUESTEE_PERSON and MUTUAL_CONNECTION_COUNT(other common connections count between them). Any help is appreciated

For Example if we have a table data as below

ID | REQUEST_PERSON | REQUESTEE_PERSON
1    A                B
2    A                C
3    B                C
4    D                B
5    D                A
6    A                E
7    B                E
8    A                F
9    C                G

I need a VIEW display below

ID | REQUEST_PERSON | REQUESTEE_PERSON  | MUTUAL_CONNECTION_COUNT
1    A                B                   3
2    A                C                   1
3    B                C                   1
4    D                B                   1
5    D                A                   1
6    A                E                   1
7    B                E                   1
8    A                F                   0
9    C                G                   0

Solution

  • This is rather tricky. Here is code that does what you want:

    select c.*,
           (select count(*)
            from (select v.person2
                  from connections c2 cross apply
                       (values (c2.REQUESTEE_PERSON, c2.REQUEST_PERSON), (c2.REQUEST_PERSON, c2.REQUESTEE_PERSON)
                       ) v(person1, person2)
                  where v.person1 IN (c.Request_Person, c.Requestee_Person) 
                  group by v.person2
                  having count(*) = 2
                 ) v
           ) in_common    
    from connections c
    order by id;
    

    Here is a SQL Fiddle.

    The essence of the problem is finding people who are connected to both people in each row. Your connections are unidirectional, which makes the logic hard to express -- C could be either the first or second person in either connection.

    Arrgh!

    So, the innermost subquery adds reverse links to the graph. Then, it can focus on filtering by the first person -- who has to match the persons in the outer query. The second person is the one that might be in common.

    The inner aggregation is just summarizing by the second person. It filters using having count(*) = 2 to indicate that both people in the outer query need to be connected to the second person in the inner query. The count(*) assumes that you have no duplicates.

    Then, these are counted, which is the value you want.