Converting a dataframe into a long format contingency table for network analysis purposes

This is probably very basic (sorry), but I'm a beginner in coding and haven't been able to find instructions online. I use R and have a dataframe like this:

emot <- data.frame(id = 1:4, happy = c(1, 0, 1, 0), sad = c(0, 0, 0, 1),
                   angry = c(0, 1, 0, 0), excited = c(1, 0, 1, 1))
  id happy sad angry excited
  1     1   0     0       1
  2     0   0     1       0
  3     1   0     0       1
  4     0   1     0       1

id signifies a person and the other variables signify whether the person mentioned a certain emotion (1) or not (0). I'd like to convert the dataframe to obtain this result:

source target  count
happy  sad      0
happy  angry    0
happy  excited  2
sad    angry    0
sad    excited  1
angry  excited  0

I really tried with the table function, but to no avail. Thank you in advance!


  • In base R you could do something like:

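    # All unordered pairs of emotion columns (id excluded): a 2 x 6 character matrix
    combo <- combn(names(emot)[-1], 2)

    # One row per pair; count the people for whom both columns equal 1
    data.frame(t(combo)) |>
      setNames(c("source", "target")) |>
      transform(count = apply(combo, 2, \(x) sum(rowMeans(emot[, x]) == 1)))

    Basically, get every pair of column names (excluding id), then for each pair pull those two columns and count the rows where both are 1. With 0/1 data the row mean of two columns equals 1 only when both are 1. The |> pipe and the \(x) shorthand need R 4.1 or later.


      source  target count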
    1  happy     sad     0
    2  happy   angry     0
    3  happy excited     2
    4    sad   angry     0
    5    sad excited     1
    6  angry excited     0
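
    If the data has many emotion columns, a matrix-based variant may be worth considering. The following is only a sketch, not part of the approach above: it assumes the columns are strictly 0/1 and rebuilds emot from the values printed in the question. The names m, co, idx and edges are made up for illustration. crossprod() gives the full co-occurrence matrix, and the strict upper triangle keeps each unordered pair once.

```r
# Rebuild the example data (values copied from the table in the question)
emot <- data.frame(
  id      = 1:4,
  happy   = c(1, 0, 1, 0),
  sad     = c(0, 0, 0, 1),
  angry   = c(0, 1, 0, 0),
  excited = c(1, 0, 1, 1)
)

# Co-occurrence matrix: entry [i, j] counts people who mentioned both i and j
m  <- as.matrix(emot[-1])
co <- crossprod(m)

# Keep each unordered pair once (strict upper triangle), then go to long format
idx <- which(upper.tri(co), arr.ind = TRUE)
edges <- data.frame(
  source = rownames(co)[idx[, "row"]],
  target = colnames(co)[idx[, "col"]],
  count  = co[idx]
)
edges
```

    The counts should match the table above, but the rows follow the column-major order of the upper triangle, so they will not be in the same order as the combn() result.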