I have 3 people in a study. They are asked to pick up as many items of fruit as they wish. I then want to count the times each fruit was picked up and create a proportion using the number of participants as the denominator.
I have a participant id and the name of the fruit arranged in a table as below:
id<-c("a","a","a","b","b","c","c","c","c")
fruit<-c("apple","pear","orange","apple","grapes","apple","pear","orange","grapefruit")
data<-data.frame(id,fruit, stringsAsFactors = FALSE)
#I would normally approach the problem using the tabyl function
janitor::tabyl(data,fruit)
tabyl uses the total number of rows (9) as the denominator, which isn't what I want. It says 33% of people chose the apple, whereas the percentage I need is 100% for the apple (all 3 participants chose it), 33% for the grapes, etc.
Can anyone suggest any code that calculates the percentage of each fruit using the number of participants as the denominator?
dplyr:
library(dplyr)
data |>
  summarise(
    n = n(),
    percent = n() / n_distinct(data$id) * 100,
    .by = fruit
  )
#        fruit n   percent
# 1      apple 3 100.00000
# 2       pear 2  66.66667
# 3     orange 2  66.66667
# 4     grapes 1  33.33333
# 5 grapefruit 1  33.33333
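One caveat: n() counts rows, so if a participant could pick up the same fruit more than once, the percentage could exceed 100. A minimal sketch of a variant that counts distinct participants per fruit instead, assuming dplyr >= 1.1.0 for `.by`. The `picks` data and the column names `n_picks` and `n_people` are made up for illustration and are not part of the question:

```r
library(dplyr)

# Hypothetical data: participant "a" picks up an apple twice
id    <- c("a", "a", "a", "b", "b", "c")
fruit <- c("apple", "apple", "pear", "apple", "grapes", "pear")
picks <- data.frame(id, fruit, stringsAsFactors = FALSE)

# Denominator: number of distinct participants in the whole study
n_participants <- n_distinct(picks$id)

result <- picks |>
  summarise(
    n_picks  = n(),             # how many times the fruit was picked up
    n_people = n_distinct(id),  # how many different participants picked it
    percent  = n_people / n_participants * 100,
    .by = fruit
  )
result
```

If each participant can only pick a given fruit once, as in the example data in the question, this gives the same percentages as the simpler version.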
data.table:
library(data.table)
setDT(data)[, .(n = .N, percent = .N / uniqueN(data$id) * 100), by = fruit]
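If you would rather avoid extra packages, the same calculation can be sketched in base R with table(). This is only one possible approach; the object names `counts` and `result` are illustrative:

```r
id    <- c("a", "a", "a", "b", "b", "c", "c", "c", "c")
fruit <- c("apple", "pear", "orange", "apple", "grapes",
           "apple", "pear", "orange", "grapefruit")
data  <- data.frame(id, fruit, stringsAsFactors = FALSE)

# Count how often each fruit appears (table() sorts the fruit names alphabetically)
counts <- table(data$fruit)

# Divide by the number of distinct participants, not the number of rows
result <- data.frame(
  fruit   = names(counts),
  n       = as.vector(counts),
  percent = as.vector(counts) / length(unique(data$id)) * 100
)
result
```

Note that the rows come out in alphabetical order here, rather than in order of first appearance as in the dplyr and data.table versions.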