Tags: r, data-manipulation, data-cleaning, social-networking, data-conversion

Convert longitudinal data to edgelist and nodelist


I have some longitudinal data that I want to convert to a directed edgelist and a nodelist. The edgelist should have a weight column that counts, within each id, how many times let goes from one value to another between consecutive days (pairs that never occur get weight 0). Here are the longitudinal data:

longdata <- data.frame(
  id     = c(1L,1L,1L,1L,1L,2L,2L,2L,2L,2L,3L,3L,3L,3L,3L),
  day    = c(1L,2L,3L,4L,5L,1L,2L,3L,4L,5L,1L,2L,3L,4L,5L),
  let    = c("o","s","s","o","s","a","b","a","b","a","c","c","d","d","c"),
  gender = c("F","F","F","F","F","M","M","M","M","M","F","F","F","F","F"),
  age    = c(23L,23L,23L,23L,23L,31L,31L,31L,31L,31L,28L,28L,28L,28L,28L)
)

Here is the edgelist I want to obtain:

from to weight
  o  o      0
  o  s      2
  s  o      1
  s  s      1
  a  a      0
  a  b      2
  b  a      2
  b  b      0
  c  c      1
  c  d      1
  d  c      1
  d  d      1

Here is the nodelist I want to obtain:

id gender age
   1      F  23
   2      M  31
   3      F  28

Solution

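    library(dplyr)

    # Assumes rows are already sorted by day within each id.
    longdata %>%
       group_by(id) %>%
       # per id: current value is "from", the next day's value is "to"
       transmute(from = factor(let, unique(let)), to = lead(from)) %>%
       count(from, to) %>%   # n = number of from -> to transitions
       na.omit()             # drop each id's last day, which has no "to"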
    
    # A tibble: 9 x 4
    # Groups:   id [3]
         id from  to        n
      <int> <fct> <fct> <int>
    1     1 o     s         2
    2     1 s     o         1
    3     1 s     s         1
    4     2 a     b         2
    5     2 b     a         2
    6     3 c     c         1
    7     3 c     d         1
    8     3 d     c         1
    9     3 d     d         1
    

    distinct(longdata, id, gender, age)
      id gender age
    1  1      F  23
    2  2      M  31
    3  3      F  28
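
The edgelist produced above has 9 rows and leaves out the zero-weight pairs (o to o, a to a, b to b) that the question lists. One possible way to keep them, shown here as a base R sketch rather than as part of the original answer, is to count transitions per id with table(), which reports every combination of factor levels, including those with a count of 0. The helper name count_transitions is made up for this illustration.

```r
# Sample data copied from the question.
longdata <- data.frame(
  id     = rep(1:3, each = 5),
  day    = rep(1:5, times = 3),
  let    = c("o","s","s","o","s", "a","b","a","b","a", "c","c","d","d","c"),
  gender = rep(c("F","M","F"), each = 5),
  age    = rep(c(23L,31L,28L), each = 5)
)

# Hypothetical helper (not from the original answer):
# transition counts for the rows of a single id.
count_transitions <- function(d) {
  d  <- d[order(d$day), ]                       # make sure days are in order
  lv <- unique(d$let)                           # values observed for this id
  from <- factor(head(d$let, -1), levels = lv)  # value on day t
  to   <- factor(tail(d$let, -1), levels = lv)  # value on day t + 1
  # table() keeps every from/to combination of the levels, zero counts included
  out <- as.data.frame(table(from = from, to = to),
                       responseName = "weight", stringsAsFactors = FALSE)
  out[order(out$from, out$to), ]                # alphabetical by from, then to
}

# Apply the helper to each id and stack the results.
edgelist <- do.call(rbind, lapply(split(longdata, longdata$id), count_transitions))
rownames(edgelist) <- NULL

# Base R counterpart of distinct(longdata, id, gender, age).
nodelist <- unique(longdata[c("id", "gender", "age")])
rownames(nodelist) <- NULL
```

With the sample data, edgelist should have the 12 rows requested in the question, with columns from, to and weight. Because the factor levels are set per id, zero rows appear only for pairs of values that the same id actually takes, not for combinations across ids.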