r igraph adjacency-matrix

Building an adjacency matrix


I was wondering if you guys can help me build an adjacency matrix. I have data in CSV format like this:

Paper_ID    Author
2   Foster-McGregor, N.
3   Van Houte, M.
4   van de Meerendonk, A.
5   Farla, K.
6   van Houte, M.
6   Siegel, M.
8   Farla, K.
11  Farla, K.
11  Verspagen, B.

As you can see, the column "Paper_ID" has a repeated value of 11, meaning that "Farla, K." and "Verspagen, B." are coauthors of a publication. I need to build a square weighted matrix indexed by author name, where each entry counts the number of times two authors have collaborated.


Solution

  • Does the following do what you are looking for?

    # simulate data.
    d <- data.frame(
      id=c(2,3,4,5,6,6,8,11,11,12,12),
      author=c("FN", "VM","VA","FK","VM","SM","FK","FK","VB","FK","VB")
    )
    
    d
    #    id author
    # 1   2     FN
    # 2   3     VM
    # 3   4     VA
    # 4   5     FK
    # 5   6     VM
    # 6   6     SM
    # 7   8     FK
    # 8  11     FK
    # 9  11     VB
    # 10 12     FK
    # 11 12     VB
    
    # create incidence matrix (authors in rows, papers in columns):
    m <- xtabs(~author+id,d)
    m
    #       id
    # author 2 3 4 5 6 8 11 12
    #     FK 0 0 0 1 0 1  1  1
    #     FN 1 0 0 0 0 0  0  0
    #     SM 0 0 0 0 1 0  0  0
    #     VA 0 0 1 0 0 0  0  0
    #     VB 0 0 0 0 0 0  1  1
    #     VM 0 1 0 0 1 0  0  0
    
    # convert to adjacency matrix.
    # tcrossprod(m) computes m %*% t(m)
    tcrossprod(m)
    #       author
    # author FK FN SM VA VB VM
    #     FK  4  0  0  0  2  0
    #     FN  0  1  0  0  0  0
    #     SM  0  0  1  0  0  1
    #     VA  0  0  0  1  0  0
    #     VB  2  0  0  0  2  0
    #     VM  0  0  1  0  0  2
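
    In the tcrossprod() result, the diagonal counts each author's own papers rather than collaborations. If only the co-authorship weights are wanted, one option is to zero it out. A minimal sketch on the same simulated data; the variable name `a` is just chosen here for illustration:

    ```r
    # Same simulated data as above.
    d <- data.frame(
      id = c(2, 3, 4, 5, 6, 6, 8, 11, 11, 12, 12),
      author = c("FN", "VM", "VA", "FK", "VM", "SM", "FK", "FK", "VB", "FK", "VB")
    )

    # Author-by-paper incidence matrix.
    m <- xtabs(~ author + id, d)

    # Author-by-author matrix: off-diagonal entries count shared papers.
    a <- tcrossprod(m)

    # The diagonal holds each author's total paper count, which is not a
    # collaboration, so set it to zero.
    diag(a) <- 0

    # Number of papers FK and VB wrote together.
    a["FK", "VB"]
    ```

    Whether to keep or drop the diagonal depends on what the matrix is used for afterwards, so treat this as an optional step.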
    

    Note that crossprod() computes t(m) %*% m, which gives the adjacency matrix for the id variable instead (papers linked by the authors they share).
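
    To illustrate the crossprod() direction, here is a small sketch on the same simulated data. The variable name `p` is only for illustration; the row and column names are the paper ids converted to character by xtabs().

    ```r
    # Same simulated data as above.
    d <- data.frame(
      id = c(2, 3, 4, 5, 6, 6, 8, 11, 11, 12, 12),
      author = c("FN", "VM", "VA", "FK", "VM", "SM", "FK", "FK", "VB", "FK", "VB")
    )

    # Author-by-paper incidence matrix.
    m <- xtabs(~ author + id, d)

    # crossprod(m) computes t(m) %*% m: a paper-by-paper matrix whose
    # entries count the authors two papers have in common.
    p <- crossprod(m)

    # Papers 11 and 12 were both written by FK and VB.
    p["11", "12"]

    # Papers 5 and 8 share only FK.
    p["5", "8"]
    ```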