I have a data frame (named df) with 144 columns (trial numbers) recording trial success (Yes/No) for each participant (the rows). A subset looks like this:
V1 V2 V3 V4 V5
Yes No Yes Yes No
Yes No No No No
Yes Yes Yes Yes No
I want to count the occurrences of Yes and No outcomes per participant across the 144 trials. However, I also want to select specific trial numbers (say V1, V4, V5, V110, V112, etc.) and count the outcomes for those trials only. If I write:
Yes <- rowSums(df == "Yes") # Count the "Yes" per row
cbind(Yes, No = ncol(df) - Yes) # Subtract these from the number of columns and combine
# Yes No
# [1,] 3 2
# [2,] 1 4
# [3,] 4 1
This gives me the counts of Yes and No per participant, but across all trials. How can I specify certain columns (trials) and count per participant?
You can subset df using [ before comparing. Here columns 1, 4, and 5 are selected:
rowSums(df[, c(1, 4, 5)] == "Yes") # columns 1, 4, and 5
#[1] 2 1 2
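Since the question names trials (V1, V4, V5, ...) and wants both the Yes and No counts, the same idea can be sketched with column names instead of positions. The toy df below just reproduces the three sample rows from the question, and the trials vector is a hypothetical selection; with the real data you would list whichever of the 144 column names you need. Note that computing No as a difference assumes there are no missing values in the selected columns.

```r
# Toy data reproducing the sample rows from the question
df <- data.frame(
  V1 = c("Yes", "Yes", "Yes"),
  V2 = c("No",  "No",  "Yes"),
  V3 = c("Yes", "No",  "Yes"),
  V4 = c("Yes", "No",  "Yes"),
  V5 = c("No",  "No",  "No")
)

# Hypothetical selection of trials, by column name
trials <- c("V1", "V4", "V5")

# drop = FALSE keeps a data frame even if only one trial is selected
sub <- df[, trials, drop = FALSE]

Yes <- rowSums(sub == "Yes")                # "Yes" count per participant
counts <- cbind(Yes, No = ncol(sub) - Yes)  # remaining selected trials are "No"
counts
```

For the sample rows this gives 2, 1, and 2 Yes outcomes, matching the positional version above, with 1, 2, and 1 No outcomes.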
To calculate the percentage of Yes (asked in the comments), rowMeans could be used:
100 * rowMeans(df == "Yes")
#[1] 60 20 80
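The percentage can be restricted to selected trials in the same way, by subsetting before the comparison. This is a minimal sketch on the sample rows from the question; the trials vector is again a hypothetical selection.

```r
# Toy data reproducing the sample rows from the question
df <- data.frame(
  V1 = c("Yes", "Yes", "Yes"),
  V2 = c("No",  "No",  "Yes"),
  V3 = c("Yes", "No",  "Yes"),
  V4 = c("Yes", "No",  "Yes"),
  V5 = c("No",  "No",  "No")
)

# Hypothetical selection of trials, by column name
trials <- c("V1", "V4", "V5")

# Share of "Yes" among the selected trials, as a percentage per participant
pct_yes <- 100 * rowMeans(df[, trials, drop = FALSE] == "Yes")
pct_yes
```

With three selected trials, the first and third participants come out at two thirds (about 66.7) and the second at one third (about 33.3).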