Search code examples
rdata-structuresdata-manipulationcrosstabtwo-way-binding

Create count table under specific condition


I have a set of binary variables (with the values of 0 and 1) and I want to create a two-way count table that summarizes the counts of cooccurrence of pairs of variables (i.e., both of them have the value of 1). Here is an example dataset:

mm <- matrix(0, 5, 6)
m <- 2
n <- 2
df <- data.frame(apply(mm, c(1,2), function(x) sample(c(0,1),1)))
colnames(df) <- c("Horror", "Thriller", "Comedy", "Romantic", "Sci.fi", "gender")

In the end, I would like to have the table that counts the cooccurrence of Horror(=1) and gender(=1), Thriller(=1) and gender(=1), Comedy(=1) and gender(=1), Romantic(=1) and gender(=1), and sci.fi(=1) and gender(=1).


Solution

  • Something like this?

    library(dplyr)
    df %>% 
      mutate(across(-gender, ~ifelse(.==1 & gender ==1, 1, 0), .names = "{col}_gender1" )) %>% 
      summarise(across(ends_with("gender1"), sum))
    
      Horror_gender1 Thriller_gender1 Comedy_gender1 Romantic_gender1 Sci.fi_gender1
    1              1                3              2                1              0