I'm trying to run a two-way repeated-measures ANOVA to look at the effects of condition and time on systolic BP. I'm using the anova_test() function, but I'm getting this error:
Error in `spread()`:
! Each row of output must be identified by a unique combination of keys.
ℹ Keys are shared for 6 rows
• 14, 36
• 182, 204
• 98, 120
I'm not sure why these rows are being read as non-unique:
> df_lbp[c(14,36),]
# A tibble: 2 × 10
subject_id condition visit time syst timef conditionf
<dbl> <dbl> <dbl> <int> <dbl> <fct> <fct>
1 129 0 1 1 106. anticipate Control
2 165 1 1 1 119 anticipate Stress
> df_lbp[c(182, 204),]
# A tibble: 2 × 10
subject_id condition visit time syst timef conditionf
<dbl> <dbl> <dbl> <int> <dbl> <fct> <fct>
1 129 0 1 3 103. recovery Control
2 165 1 1 3 121. recovery Stress
> df_lbp[c(98, 120),]
# A tibble: 2 × 10
subject_id condition visit time syst timef conditionf
<dbl> <dbl> <dbl> <int> <dbl> <fct> <fct>
1 129 0 1 2 102. task Control
2 165 1 1 2 128 task Stress
I'm curious what R is using as keys, and I'd appreciate any help in getting this to work. My code and data are below.
a1 <- anova_test(
  data = df_lbp,
  dv = syst,
  wid = subject_id,
  within = c(timef, conditionf)
)
get_anova_table(a1)
dput(df_lbp)
Your subject with subject_id 161 has duplicate entries (two rows for each timef value under the Stress condition):
library(dplyr)
df_lbp |>
count(subject_id, timef, conditionf) |>
filter(n > 1)
output:
# A tibble: 3 x 4
subject_id timef conditionf n
<dbl> <fct> <fct> <int>
1 161 anticipate Stress 2
2 161 task Stress 2
3 161 recovery Stress 2
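To decide what to do with such duplicates, it can help to look at the full rows rather than just the counts. The following is a minimal sketch on a made-up toy table (`toy` and its values are hypothetical stand-ins for `df_lbp`, not the real data); it keeps every row that belongs to a repeated subject/time/condition cell:

```r
library(dplyr)

# Toy stand-in for df_lbp; the values are invented for illustration only
toy <- tibble(
  subject_id = c(1, 1, 2, 2, 2),
  timef      = factor(c("anticipate", "task", "anticipate", "anticipate", "task")),
  conditionf = factor(c("Control", "Control", "Stress", "Stress", "Stress")),
  syst       = c(110, 115, 120, 124, 130)
)

# Keep all rows whose subject/time/condition combination occurs more than once
dups <- toy |>
  group_by(subject_id, timef, conditionf) |>
  filter(n() > 1) |>
  ungroup()

print(dups)
```

On the real data, the same pipeline applied to `df_lbp` would show the other columns (for example `visit`) for the repeated rows, which may hint at where they come from.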
With subject 161 excluded (which removes the duplicates), anova_test() runs OK:
df_lbp |>
filter(subject_id != 161) |>
rstatix::anova_test(dv = syst,
wid = subject_id,
within = c(timef, conditionf)
)
output:
ANOVA Table (type III tests)
$ANOVA
Effect DFn DFd F p p<.05 ges
1 timef 2 46 38.931 1.28e-10 * 0.076
2 conditionf 1 23 28.937 1.83e-05 * 0.284
3 timef:conditionf 2 46 37.078 2.57e-10 * 0.087
## etc.
Edit: as r2evans pointed out, you can keep only the distinct combinations of variables (instead of first checking for duplicates and singling them out) like so. Note that the first (topmost) observation of each duplicate is kept:
df_lbp |>
distinct(subject_id, timef, conditionf,
.keep_all = TRUE
) |>
rstatix::anova_test(dv = syst,
wid = subject_id,
within = c(timef, conditionf)
)
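If keeping the first row of each duplicate feels arbitrary, one alternative (not part of the suggestions above, just a sketch) is to average the repeated measurements within each cell before running the ANOVA. Whether averaging is appropriate depends on why the duplicates exist, so treat this as an assumption to check against your study design. The toy table and its values are again hypothetical:

```r
library(dplyr)

# Toy stand-in for df_lbp; the values are invented for illustration only
toy <- tibble(
  subject_id = c(1, 1, 2, 2, 2),
  timef      = factor(c("anticipate", "task", "anticipate", "anticipate", "task")),
  conditionf = factor(c("Control", "Control", "Stress", "Stress", "Stress")),
  syst       = c(110, 115, 120, 124, 130)
)

# Collapse each subject/time/condition cell to a single row by averaging syst
averaged <- toy |>
  group_by(subject_id, timef, conditionf) |>
  summarise(syst = mean(syst), .groups = "drop")

print(averaged)
```

The result has exactly one row per subject/time/condition combination, so it could be passed to rstatix::anova_test() in the same way as the distinct() version.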