I'm working in R (with the tidyverse) on data from a questionnaire consisting of 11 questions, each answered on a 4-point Likert scale.
The data is in a data frame with participants as rows, and responses to each question stored as an ordered factor in individual columns.
The following code replicates 5 rows of the data as it is currently stored:
library(tidyverse)
df <- tibble(id = c(1, 2, 3, 4, 5), q1 = c(3, 4, 2, 3, 3),
             q2 = c(4, 4, 2, 3, 2), q3 = c(3, 3, 2, 2, 3),
             q4 = c(2, 2, 3, 2, 1), q5 = c(3, 3, 3, 3, 3),
             q6 = c(4, 3, 2, 2, 2), q7 = c(1, 2, 2, 2, 2),
             q8 = c(3, 3, 3, 2, 1), q9 = c(3, 4, 4, 2, 1),
             q10 = c(2, 4, 3, 2, 1), q11 = c(2, 3, 2, 2, 1)) %>%
  mutate(across(q1:q11, ~factor(.x,
                                levels = c(1, 2, 3, 4),
                                labels = c("Less than usual",
                                           "No more than usual",
                                           "More than usual",
                                           "Much more than usual"),
                                ordered = TRUE)))
str(df)
# tibble [5 × 12] (S3: tbl_df/tbl/data.frame)
# $ id : num [1:5] 1 2 3 4 5
# $ q1 : Ord.factor w/ 4 levels "Less than usual"<..: 3 4 2 3 3
# $ q2 : Ord.factor w/ 4 levels "Less than usual"<..: 4 4 2 3 2
# $ q3 : Ord.factor w/ 4 levels "Less than usual"<..: 3 3 2 2 3
# $ q4 : Ord.factor w/ 4 levels "Less than usual"<..: 2 2 3 2 1
# $ q5 : Ord.factor w/ 4 levels "Less than usual"<..: 3 3 3 3 3
# $ q6 : Ord.factor w/ 4 levels "Less than usual"<..: 4 3 2 2 2
# $ q7 : Ord.factor w/ 4 levels "Less than usual"<..: 1 2 2 2 2
# $ q8 : Ord.factor w/ 4 levels "Less than usual"<..: 3 3 3 2 1
# $ q9 : Ord.factor w/ 4 levels "Less than usual"<..: 3 4 4 2 1
# $ q10: Ord.factor w/ 4 levels "Less than usual"<..: 2 4 3 2 1
# $ q11: Ord.factor w/ 4 levels "Less than usual"<..: 2 3 2 2 1
I need to calculate totals using two different scoring systems, both for the whole questionnaire and for two subscales made up of selected questions. The first subscale comprises questions 1–7 and the second comprises questions 8–11.
How can I calculate these totals using the two scoring systems to get the 6 (sub)totals: total_likert, total_binary, total_ss1_likert, total_ss1_binary, total_ss2_likert and total_ss2_binary?
You can first convert the responses to numeric scores under each scoring system with across and recode (replace is another option), and then calculate the sum scores for each id using rowwise and c_across:
df %>%
  # Likert scoring (0-1-2-3), stored in new likert_q* columns
  mutate(across(starts_with("q"),
                ~ recode(.x, "Less than usual" = 0,
                         "No more than usual" = 1,
                         "More than usual" = 2,
                         "Much more than usual" = 3),
                .names = "likert_{.col}")) %>%
  # Binary scoring (0-0-1-1), stored in new binary_q* columns
  mutate(across(starts_with("q"),
                ~ recode(.x, "Less than usual" = 0,
                         "No more than usual" = 0,
                         "More than usual" = 1,
                         "Much more than usual" = 1),
                .names = "binary_{.col}")) %>%
  # Sum the scored columns within each row (one row per id)
  rowwise(id) %>%
  mutate(total_likert = sum(c_across(likert_q1:likert_q11)),
         total_ss1_likert = sum(c_across(likert_q1:likert_q7)),
         total_ss2_likert = sum(c_across(likert_q8:likert_q11)),
         total_binary = sum(c_across(binary_q1:binary_q11)),
         total_ss1_binary = sum(c_across(binary_q1:binary_q7)),
         total_ss2_binary = sum(c_across(binary_q8:binary_q11))) %>%
  # Drop the rowwise grouping so later steps work on the whole table again
  ungroup() %>%
  select(id, total_likert, total_binary, total_ss1_likert, total_ss1_binary, total_ss2_likert, total_ss2_binary)
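As a possible alternative, here is a minimal sketch that skips recode (which recent dplyr versions mark as superseded) and rowwise altogether. It assumes the scoring implied by the recode calls above: Likert is 0-1-2-3 in level order, and binary counts the two upper categories as 1. Because the columns are ordered factors, as.integer() gives the level codes 1 to 4 directly, and rowSums() does the per-row totals. The helper names labs, likert, binary, ss1, ss2 and totals are only illustrative.

```r
library(dplyr)

# Rebuild the question's example data (same values, same factor labels)
labs <- c("Less than usual", "No more than usual",
          "More than usual", "Much more than usual")
df <- tibble(
  id = 1:5,
  q1 = c(3, 4, 2, 3, 3), q2 = c(4, 4, 2, 3, 2), q3 = c(3, 3, 2, 2, 3),
  q4 = c(2, 2, 3, 2, 1), q5 = c(3, 3, 3, 3, 3), q6 = c(4, 3, 2, 2, 2),
  q7 = c(1, 2, 2, 2, 2), q8 = c(3, 3, 3, 2, 1), q9 = c(3, 4, 4, 2, 1),
  q10 = c(2, 4, 3, 2, 1), q11 = c(2, 3, 2, 2, 1)
) %>%
  mutate(across(q1:q11, ~ factor(.x, levels = 1:4, labels = labs,
                                 ordered = TRUE)))

# Level codes of the factor are 1..4, so subtracting 1 gives Likert 0..3
likert <- df %>%
  select(q1:q11) %>%
  mutate(across(everything(), ~ as.integer(.x) - 1L))

# Assumed binary rule: Likert score 2 or 3 counts as 1, otherwise 0
binary <- likert %>%
  mutate(across(everything(), ~ as.integer(.x >= 2L)))

# Column names of the two subscales (questions 1-7 and 8-11)
ss1 <- paste0("q", 1:7)
ss2 <- paste0("q", 8:11)

totals <- tibble(
  id = df$id,
  total_likert = unname(rowSums(likert)),
  total_binary = unname(rowSums(binary)),
  total_ss1_likert = unname(rowSums(likert[ss1])),
  total_ss1_binary = unname(rowSums(binary[ss1])),
  total_ss2_likert = unname(rowSums(likert[ss2])),
  total_ss2_binary = unname(rowSums(binary[ss2]))
)

totals
```

This relies on the factor levels being in scoring order, so it would need adjusting if the levels were defined differently. On larger data, rowSums() is usually noticeably faster than rowwise() with c_across().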