I have a potentially unique problem where I would like to average some values in a given column by using the mutate()
function in dplyr
, but only between those rows that contain specific strings in another column.
For example, I have two subjects (Person 1
and Person 2
) and three different types of tests (test
column) they performed. Test 1
and Test 2
are the same test but performed on different limbs, so I would like to group by each subject (name
column) and average the values in the error
column, but only between rows that contain Test 1
and Test 2
. error
values for Test 3
I'd like left untouched for each subject. Hope that makes sense, and any help will be appreciated.
set.seed(1)
dat <- data.frame(
name = rep(paste("Person", 1:2), each = 30),
test = rep(paste("Test", 1:3), each = 10),
value = round(rnorm(60, 100, 20)),
error = round(runif(60, 2, 15), 1)
)
Below is the first 3 rows by group of the above data. For the output, the number of rows should be the same, but the error values based on the above conditions aggregated. For example, when looking at Person 1
rows 1 and 4 should be the average of rows 1 and 4 (first Test 1
observations), row 2
and 5 should be the average of rows 2 and 5 and so on. Test 3
is left untouched.
# A tibble: 18 × 4
# Groups: name, test [6]
name test value error
<chr> <chr> <dbl> <dbl>
1 Person 1 Test 1 87 14.9
2 Person 1 Test 1 104 8.4
3 Person 1 Test 1 83 8.3
4 Person 1 Test 2 130 9.5
5 Person 1 Test 2 108 3
6 Person 1 Test 2 88 2.5
7 Person 1 Test 3 118 10.9
8 Person 1 Test 3 116 9.8
9 Person 1 Test 3 101 5.1
10 Person 2 Test 1 127 10
11 Person 2 Test 1 98 9.2
12 Person 2 Test 1 108 6.3
13 Person 2 Test 2 97 5.7
14 Person 2 Test 2 95 13.6
15 Person 2 Test 2 114 7.8
16 Person 2 Test 3 108 10.2
17 Person 2 Test 3 88 12.9
18 Person 2 Test 3 107 13.1
Assuming you want to maintain the value and error columns e.g. keep all rows:
library(dplyr)
library(slider)
result <- dat |>
group_by(name, test) |>
mutate(
error_roll_mean = ifelse(
test %in% c("Test 1", "Test 2"),
case_when(
row_number() <= 3 ~ NA,
.default = slide_dbl(error, mean, .before = 3, .complete = TRUE)
),
error
)
) |>
ungroup()
data.frame(result)
# name test value error error_roll_mean
# 1 Person 1 Test 1 87 14.9 NA
# 2 Person 1 Test 1 104 8.4 NA
# 3 Person 1 Test 1 83 8.3 NA
# 4 Person 1 Test 1 132 4.3 8.975
# 5 Person 1 Test 1 107 11.8 8.200
# 6 Person 1 Test 1 84 7.9 8.075
# 7 Person 1 Test 1 110 8.6 8.150
# 8 Person 1 Test 1 115 4.7 8.250
# 9 Person 1 Test 1 112 5.0 6.550
# 10 Person 1 Test 1 94 9.7 7.000
# 11 Person 1 Test 2 130 9.5 NA
# 12 Person 1 Test 2 108 3.0 NA
# 13 Person 1 Test 2 88 2.5 NA
# 14 Person 1 Test 2 56 10.4 6.350
# 15 Person 1 Test 2 122 14.1 7.500
# 16 Person 1 Test 2 99 9.8 9.200
# 17 Person 1 Test 2 100 9.3 10.900
# 18 Person 1 Test 2 119 8.8 10.500
# 19 Person 1 Test 2 116 14.8 10.675
# 20 Person 1 Test 2 112 8.6 10.375
# 21 Person 1 Test 3 118 10.9 10.900
# 22 Person 1 Test 3 116 9.8 9.800
# 23 Person 1 Test 3 101 5.1 5.100
# 24 Person 1 Test 3 60 5.4 5.400
# 25 Person 1 Test 3 112 11.5 11.500
# 26 Person 1 Test 3 99 7.9 7.900
# 27 Person 1 Test 3 97 4.3 4.300
# 28 Person 1 Test 3 71 11.7 11.700
# 29 Person 1 Test 3 90 3.4 3.400
# 30 Person 1 Test 3 108 13.2 13.200
# 31 Person 2 Test 1 127 10.0 NA
# 32 Person 2 Test 1 98 9.2 NA
# 33 Person 2 Test 1 108 6.3 NA
# 34 Person 2 Test 1 99 7.9 8.350
# 35 Person 2 Test 1 72 8.5 7.975
# 36 Person 2 Test 1 92 4.4 6.775
# 37 Person 2 Test 1 92 8.9 7.425
# 38 Person 2 Test 1 99 3.0 6.200
# 39 Person 2 Test 1 122 5.6 5.475
# 40 Person 2 Test 1 115 4.8 5.575
# 41 Person 2 Test 2 97 5.7 NA
# 42 Person 2 Test 2 95 13.6 NA
# 43 Person 2 Test 2 114 7.8 NA
# 44 Person 2 Test 2 111 12.1 9.800
# 45 Person 2 Test 2 86 13.4 11.725
# 46 Person 2 Test 2 86 7.4 10.175
# 47 Person 2 Test 2 107 2.8 8.925
# 48 Person 2 Test 2 115 6.4 7.500
# 49 Person 2 Test 2 98 11.4 7.000
# 50 Person 2 Test 2 118 6.4 6.750
# 51 Person 2 Test 3 108 10.2 10.200
# 52 Person 2 Test 3 88 12.9 12.900
# 53 Person 2 Test 3 107 13.1 13.100
# 54 Person 2 Test 3 77 7.1 7.100
# 55 Person 2 Test 3 129 6.9 6.900
# 56 Person 2 Test 3 140 13.6 13.600
# 57 Person 2 Test 3 93 10.4 10.400
# 58 Person 2 Test 3 79 11.6 11.600
# 59 Person 2 Test 3 111 9.9 9.900
# 60 Person 2 Test 3 97 13.7 13.700