
Need to set cutoffs before creating an adjacency matrix


This is a small part of the data set I have:

      Winner    Player 1    Player 2    Player 3
       Susan    Archie      Heck         Jay
       Archie   Brown       Susan        Jay
       Heck     Archie      Jay          Brown
       Jay      Brown       Archie       Susan
       Brown    Susan       Archie       Jay
       Archie   Brown       Susan        Heck
       Susan    Heck        Jay          Brown
       Jay      Heck        Susan        Brown
       Susan    Archie      Heck         Brown
       Lee      Susan       Jay          Heck
       Kyle     Heck        Jay          Susan

I used the following code to convert this into an adjacency matrix:

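   d <- read.csv("res.csv")
   # every name that appears anywhere, so all columns share the same factor levels
   lvs <- sort(as.character(unique(unlist(d))))
   d[] <- lapply(d, factor, levels = lvs)
   # count, for each Player/Winner pair, how often that player lost to that winner
   res <- table(d[c("Player.1","Winner")]) + 
     table(d[c("Player.2","Winner")]) + 
     table(d[c("Player.3","Winner")])  
   diag(res) <- 0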

What I need to do now is apply a cutoff: the matrix should only count pairs of players who have played at least 2 matches against each other.

The output should be an adjacency matrix, with only players who have played each other at least twice. So, the original matrix looks like this:

          Winner    Susan   Archie  Heck    Jay     Brown   Lee     Kyle
          Susan       0       2      0       2         1     1       1
          Archie      2       0      1       1         1     0       0
          Heck        3       1      0       1         0     1       1
          Jay         2       1      1       0         1     1       1
          Brown       2       2      1       2         0     0       0
          Lee         0       0      0       0         0     0       0
          Kyle        0       0      0       0         0     0       0

After pairings that occurred only once are eliminated, the resulting matrix should be the following:

          Winner    Susan   Archie  Heck    Jay     Brown   Lee     Kyle
          Susan       0       2      0       2         1     0       0
          Archie      2       0      1       1         1     0       0
          Heck        3       1      0       1         0     0       0
          Jay         2       1      1       0         1     0       0
          Brown       2       2      0       2         0     0       0
          Lee         0       0      0       0         0     0       0
          Kyle        0       0      0       0         0     0       0

Solution

  • This is easier after gathering the data into 'long' format:

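    library(tidyverse)
    # all player names, as in the question's code
    lvs <- sort(as.character(unique(unlist(d))))
    # stack the Player columns, then cross-tabulate Winner against player
    out <- gather(d, key, val, -Winner) %>% 
              select(-key) %>%
              mutate(val = factor(val, levels = lvs)) %>% 
              table %>% 
              t   # transpose so that rows are players and columns are winners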
    

    Then set to 0 the columns of the players whose rows sum to 0 (here Lee and Kyle, who only ever appear as winners):

    out[, names(which(!rowSums(out)))] <- 0
    

    Data

    d <- structure(list(Winner = structure(c(7L, 1L, 3L, 4L, 2L, 1L, 7L, 
    4L, 7L, 6L, 5L), .Label = c("Archie", "Brown", "Heck", "Jay", 
    "Kyle", "Lee", "Susan"), class = "factor"), Player1 = structure(c(1L, 
    2L, 1L, 2L, 7L, 2L, 3L, 3L, 1L, 7L, 3L), .Label = c("Archie", 
    "Brown", "Heck", "Jay", "Kyle", "Lee", "Susan"), class = "factor"), 
        Player2 = structure(c(3L, 7L, 4L, 1L, 1L, 7L, 4L, 7L, 3L, 
        4L, 4L), .Label = c("Archie", "Brown", "Heck", "Jay", "Kyle", 
        "Lee", "Susan"), class = "factor"), Player3 = structure(c(4L, 
        4L, 2L, 7L, 4L, 3L, 2L, 2L, 2L, 3L, 7L), .Label = c("Archie", 
        "Brown", "Heck", "Jay", "Kyle", "Lee", "Susan"), 
     class = "factor")), row.names = c(NA, 
    -11L), class = "data.frame")
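
Zeroing the columns of players who never lost only handles Lee and Kyle. A more literal reading of the cutoff is sketched below in base R, assuming that "played each other" means a game in which one of the two players was the winner, so the number of meetings for a pair is the count in both directions added together. The names `meetings`, `cutoff` and `res_cut` are illustrative and do not come from the question or the answer above.

```r
# Sample data copied from the question: the winner plus the three other players.
d <- data.frame(
  Winner  = c("Susan", "Archie", "Heck", "Jay", "Brown", "Archie",
              "Susan", "Jay", "Susan", "Lee", "Kyle"),
  Player1 = c("Archie", "Brown", "Archie", "Brown", "Susan", "Brown",
              "Heck", "Heck", "Archie", "Susan", "Heck"),
  Player2 = c("Heck", "Susan", "Jay", "Archie", "Archie", "Susan",
              "Jay", "Susan", "Heck", "Jay", "Jay"),
  Player3 = c("Jay", "Jay", "Brown", "Susan", "Jay", "Heck",
              "Brown", "Brown", "Brown", "Heck", "Susan"),
  stringsAsFactors = FALSE
)

# All names, so that every table has the same rows and columns.
lvs <- sort(unique(unlist(d, use.names = FALSE)))

# Rows: player who did not win, columns: winner (same layout as in the question).
res <- Reduce(`+`, lapply(d[-1], function(p) {
  table(Player = factor(p, levels = lvs),
        Winner = factor(d$Winner, levels = lvs))
}))

# Assumption: meetings of a pair = games won by either one against the other.
meetings <- res + t(res)

# Keep only the entries of pairs that met at least `cutoff` times.
cutoff <- 2
res_cut <- res
res_cut[meetings < cutoff] <- 0
res_cut
```

With the sample data this clears the Lee and Kyle columns and also the single Brown/Heck entry, while pairs such as Archie/Heck (one win each, so two meetings) are kept. Rows and columns come out in alphabetical order rather than in the order shown in the question.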