table of proportions in R taking into account unique persons


I have 3 people in a study. They are asked to pick up as many items of fruit as they wish. I then want to count the times each fruit was picked up and create a proportion using the number of participants as the denominator.

I have a participant id and the name of the fruit arranged in a table as below:

id<-c("a","a","a","b","b","c","c","c","c")
fruit<-c("apple","pear","orange","apple","grapes","apple","pear","orange","grapefruit")
data<-data.frame(id,fruit, stringsAsFactors = FALSE)

#I would normally approach the problem using the tabyl function
janitor::tabyl(data,fruit)

The percentage uses the total number of rows as the denominator, which isn't what I want. It says 33% of people chose the apple (3 of 9 rows), whereas the figure I need is that 100% of the people chose the apple, 33% chose the grapes, etc.

Can anyone suggest any code that calculates the percentage of each fruit using the number of participants as the denominator?


Solution

  • dplyr:

    library(dplyr)
    data |> 
      summarise(
        n = n(), 
        percent = n() / n_distinct(data$id) * 100,
        .by = fruit
      )
    
    #        fruit n   percent
    # 1      apple 3 100.00000
    # 2       pear 2  66.66667
    # 3     orange 2  66.66667
    # 4     grapes 1  33.33333
    # 5 grapefruit 1  33.33333
    

  • data.table:

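    library(data.table)
    # multiply by 100 so the result is a percentage, matching the dplyr output
    setDT(data)[, .(n = .N, percent = .N / length(unique(data$id)) * 100), by = fruit]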
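
Both snippets above count rows per fruit, so they assume each person appears at most once per fruit. That holds for the sample data, but if the same participant could pick the same fruit twice, the percentage could exceed 100. A minimal base R sketch that guards against this by first reducing the data to unique id/fruit pairs (the names `pairs`, `n_people`, `counts` and `result` are illustrative, not from the original post):

```r
# Same toy data as in the question
id <- c("a", "a", "a", "b", "b", "c", "c", "c", "c")
fruit <- c("apple", "pear", "orange", "apple", "grapes",
           "apple", "pear", "orange", "grapefruit")
data <- data.frame(id, fruit, stringsAsFactors = FALSE)

# Keep one row per (id, fruit) pair so a repeated pick by the
# same person is only counted once (assumption: repeats may occur)
pairs <- unique(data[c("id", "fruit")])

# Denominator: number of distinct participants
n_people <- length(unique(data$id))

# People per fruit, converted to a percentage of participants
counts <- table(pairs$fruit)
result <- data.frame(
  fruit = names(counts),
  n = as.vector(counts),
  percent = as.vector(counts) / n_people * 100
)

# Most frequently chosen fruit first
result <- result[order(-result$n), ]
print(result)
```

For the sample data this should give the same counts and percentages as the dplyr and data.table versions, since no participant repeats a fruit.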