Suppose I have a list of data frames as follows:
df1 <- data.frame(a1 = 1:5, a2 = 1:5, a3 = 1:5)
df2 <- data.frame(a1 = 1:3, a2 = 2:4, a3 = 3:5)
df3 <- data.frame(a1 = 10:20, a2 = 5:15)
l <- list(df1 = df1, df2 = df2, df3 = df3)
What should I do to perform operations (like mutate) on each element of the list, conditional on the element's name?
For instance, how would I proceed if I wanted to add a new column only when dealing with df1 or df3, and delete a column when dealing with df2?
Could map_if deal with that?
PS: Keep in mind that the list would probably have more than 3 datasets, so multiple conditions would likely be needed.
You can do this sort of operation with imap instead. Since you want to apply operations based on the names of the list's elements, imap is the right tool.
The .f argument of imap takes 2 arguments:
.x, the first argument, which represents the value
.y, the second argument, which represents the names of the elements or, if they have no names, their positions
So in this case the .x values are your 3 data sets and the .y values are their names (df1, df2, df3), or their positions (1:3) if the list had no names.
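To see what imap passes as .y, here is a minimal sketch on two toy lists. The lists `named` and `unnamed` are made up for illustration and are not part of the question's data.

```r
library(purrr)

# Hypothetical toy lists, used only to show what .x and .y contain
named   <- list(first = 1:2, second = 3:5)
unnamed <- list(1:2, 3:5)

# With names, .y is the name of each element
by_name <- imap_chr(named, ~ paste0(.y, " has ", length(.x), " values"))

# Without names, .y is the position of each element
by_pos <- imap_chr(unnamed, ~ paste0("element ", .y, " has ", length(.x), " values"))

print(by_name)
print(by_pos)
```

imap_chr is used here only so that the result prints as a character vector; plain imap would return a list.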
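library(purrr)
library(dplyr)  # needed for mutate()

l %>%
  imap(~ if (.y %in% c("df1", "df3")) {
    # df1 and df3: add (or overwrite) a3 as the sum of a1 and a2
    .x %>%
      mutate(a3 = a1 + a2)
  } else {
    # every other element: drop the third column
    .x[-3]
  })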
$df1
  a1 a2 a3
1  1  1  2
2  2  2  4
3  3  3  6
4  4  4  8
5  5  5 10

$df2
  a1 a2
1  1  2
2  2  3
3  3  4

$df3
   a1 a2 a3
1  10  5 15
2  11  6 17
3  12  7 19
4  13  8 21
5  14  9 23
6  15 10 25
7  16 11 27
8  17 12 29
9  18 13 31
10 19 14 33
11 20 15 35
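With more than a handful of data sets, the if/else can get long. One possible way to organise several name-based rules is base R's switch inside imap. This is only a sketch under the assumption that every list element is named; the fallback branch that returns the data frame unchanged is my own addition, not something the question asked for.

```r
library(purrr)
library(dplyr)

l <- list(
  df1 = data.frame(a1 = 1:5, a2 = 1:5, a3 = 1:5),
  df2 = data.frame(a1 = 1:3, a2 = 2:4, a3 = 3:5),
  df3 = data.frame(a1 = 10:20, a2 = 5:15)
)

result <- imap(l, function(df, name) {
  switch(name,
    df1 = ,                           # empty branch falls through to df3's rule
    df3 = mutate(df, a3 = a1 + a2),   # add or overwrite a3
    df2 = select(df, -a3),            # drop a3
    df                                # assumed default: leave other elements unchanged
  )
})

str(result)
```

Each name gets its own branch, so adding a rule for another data set means adding one line rather than nesting another condition.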
But if you would like to apply a function to each element that meets a certain condition, you can use map_if. For example, suppose we want to add an a4 column to every data frame whose number of rows is greater than a certain number. Bear in mind that the .p predicate should return a single TRUE or FALSE for each element:
# This use case works
l %>%
  map_if(~ nrow(.x) > 3, ~ .x %>%
    mutate(a4 = a1 + a2))

# But this doesn't, because names(.x) gives the column names of each data frame,
# not the names of the list elements, so the predicate yields one value per column
# instead of a single TRUE or FALSE
l %>%
  map_if(~ names(.x) %in% c("df1", "df3"), ~ .x %>%
    mutate(a4 = a1 + a2))
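If you do want to stay with map_if for a name-based condition, one option is to compute the condition outside the predicate: as far as I know, .p also accepts a logical vector with one entry per list element. purrr's map_at, which selects elements by name, is another possibility. The sketch below assumes the same list l as in the question.

```r
library(purrr)
library(dplyr)

l <- list(
  df1 = data.frame(a1 = 1:5, a2 = 1:5, a3 = 1:5),
  df2 = data.frame(a1 = 1:3, a2 = 2:4, a3 = 3:5),
  df3 = data.frame(a1 = 10:20, a2 = 5:15)
)

# .p given as a logical vector: TRUE for df1 and df3, FALSE for df2
by_flag <- map_if(l, names(l) %in% c("df1", "df3"), ~ mutate(.x, a4 = a1 + a2))

# map_at() picks the elements to transform by name
by_name <- map_at(l, c("df1", "df3"), ~ mutate(.x, a4 = a1 + a2))

names(by_flag$df1)
names(by_flag$df2)
```

Both calls leave df2 untouched, so they only cover the "add a column" half of the question; deleting a column from df2 would need a second call or the imap approach.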
An equivalent to imap is map2, where the second argument is the names of the list's elements (and not the column names of each element):
l %>%
  map2(names(l), ~ if (.y %in% c("df1", "df3")) {
    .x %>%
      mutate(a3 = a1 + a2)
  } else {
    .x[-3]
  })