I want to have a tibble that contains several nearly identical tibbles. In each nested tibble, one variable's contents are shuffled, the others are unchanged. I want to have N (e.g. 10) versions of each shuffled tibble.
library(tidyverse)
df <- tibble(a = 1:5, b = 6:10, c = 11:15)
permutation_n <- 10
tbl_names <- names(df)
shuffle_var <- function(data, var) {
  data %>%
    mutate({{var}} := sample({{var}}))
}
> shuffle_var(df, a)
# A tibble: 5 × 3
      a     b     c
  <int> <int> <int>
1     4     6    11
2     2     7    12
3     1     8    13
4     3     9    14
5     5    10    15
shuffle_var() works correctly if I use it separately.
But if I use it inside map(), it returns a tibble with the original variable unchanged and a new variable called .x:
result <-
  crossing(variable = tbl_names,
           index = 1:permutation_n) %>%
  mutate(data = map(variable, ~shuffle_var(df, .x)))
result %>% slice(1) %>% pull(data) %>% .[[1]]
# A tibble: 5 × 4
      a     b     c .x
  <int> <int> <int> <chr>
1     1     6    11 a
2     2     7    12 a
3     3     8    13 a
4     4     9    14 a
5     5    10    15 a
result should contain 30 rows (variables x versions).
Each variable should get 10 different versions of the tibble, each with the values of that one variable shuffled.
I have found several similar examples on SO, but none of them helped with this specific issue.
Try it with the .data pronoun inside the function. .data refers to the data frame that mutate() is currently working on, so when var is a column name passed as a string, .data[[var]] inside the function refers to the wanted column, and we can use map() to iterate over the variable names and build each nested tibble.
shuffle_var <- function(data, var) {
  data %>%
    mutate({{var}} := sample(.data[[var]]))
}
result <-
  crossing(variable = tbl_names,
           index = 1:permutation_n) %>%
  mutate(data = map(variable, ~shuffle_var(df, .x)))
result %>%
  slice(1) %>% pull(data) %>% .[[1]]
      a     b     c    .x
  <int> <int> <int> <int>
1     1     6    11     1
2     2     7    12     5
3     3     8    13     3
4     4     9    14     4
5     5    10    15     2
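Note that in this output the shuffled values still land in a new column called .x while a itself is unchanged, apparently because {{var}} on the left-hand side of := takes its name from the expression it was given (.x) rather than from the string it holds. If the goal is to overwrite the column in place, a minimal sketch, assuming var is always passed as a string, is to build the name with glue-style injection ("{var}" :=) instead. This is a variation on the function above, not a tested replacement for every use case.

```r
library(tidyverse)

df <- tibble(a = 1:5, b = 6:10, c = 11:15)
permutation_n <- 10

# Assumption: `var` is a column name passed as a string (e.g. "a").
# "{var}" injects that string as the output column name, so the shuffled
# values replace the existing column instead of creating a column named .x.
shuffle_var <- function(data, var) {
  data %>%
    mutate("{var}" := sample(.data[[var]]))
}

result <-
  crossing(variable = names(df),
           index = 1:permutation_n) %>%
  mutate(data = map(variable, ~ shuffle_var(df, .x)))

nrow(result)             # 30 rows: 3 variables x 10 versions
names(result$data[[1]])  # "a" "b" "c", no extra .x column
```

Because sample() is random, two of the 10 versions of a variable can occasionally come out identical; if that matters, duplicates would have to be checked for separately.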