I'm trying to filter several data frames stored in a list; some of them have a character column that holds date-like values.
I know that every data frame covers n entities (here 10), so I use the row count to spot the long-format data frames, where each row is an entity-date pair.
I'd then like to apply a filtering function to these long data frames: find the character column that holds dates and keep only one year.
I've read that filter_if() is deprecated, so I tried the across() syntax instead, but it has failed so far.
Any idea why it is not working?
library(dplyr)
library(purrr)
library(stringr)
## a list of data frames
list_df <- list(A = data.frame(ID = letters[1:10],
                               Var1 = rnorm(10),
                               Var2 = rnorm(10),
                               Var3 = rnorm(10)),
                B = data.frame(ID = rep(letters[1:10], 3),
                               X = c(rep("01/01/2018", 10),
                                     rep("01/01/2019", 10),
                                     rep("01/01/2020", 10)),
                               Var1 = rnorm(30),
                               Var2 = rnorm(30)),
                C = data.frame(ID = rep(letters[1:10], 2),
                               D = c(rep("01/01/2018", 10),
                                     rep("01/01/2019", 10)),
                               Var1 = rnorm(20),
                               Var2 = rnorm(20)))
## a custom function to find the character columns that hold dates (= B$X & C$D)
guessdate <- function(x) !all(is.na(as.Date(as.character(x),format="%d/%m/%Y")))
## test the function on one data frame
list_df[["B"]] %>% map(., guessdate)
## what I've tried so far
list_df %>% map_if(.p = ~ nrow(.x) > 10, # apply function only on dataframe with more than 10 rows
~ filter(across(where(map(., guessdate)), ~ str_detect(.x, "2018")))) ## filter the date-like column keeping only (2018)
## desired output
output <- list(A = data.frame(ID = letters[1:10],
                              Var1 = rnorm(10),
                              Var2 = rnorm(10),
                              Var3 = rnorm(10)),
               B = data.frame(ID = letters[1:10],
                              X = c(rep("01/01/2018", 10)),
                              Var1 = rnorm(10),
                              Var2 = rnorm(10)),
               C = data.frame(ID = letters[1:10],
                              D = c(rep("01/01/2018", 10)),
                              Var1 = rnorm(10),
                              Var2 = rnorm(10)))
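It looks like you have an extra map() in there and are not passing the data.frame to filter(). where() already applies the predicate to each column, so guessdate can be passed to it directly. Try: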
list_df %>% map_if(.p = ~ nrow(.x) > 10,
~ filter(.x, if_all(where(guessdate), ~ str_detect(.x, "2018"))))
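To see what where(guessdate) does on its own, here is a minimal sketch that uses it as a plain column selector. The data frame df is a small hypothetical stand-in for list_df$B, not part of the original data; guessdate is copied from the question.

```r
library(dplyr)

# Predicate from the question: TRUE if at least one value parses as dd/mm/yyyy
guessdate <- function(x) !all(is.na(as.Date(as.character(x), format = "%d/%m/%Y")))

# Hypothetical two-row data frame with the same column layout as list_df$B
df <- data.frame(ID = c("a", "b"),
                 X = c("01/01/2018", "01/01/2019"),
                 Var1 = c(0.5, -1.2))

# where() calls guessdate once per column and keeps the columns returning TRUE,
# so no explicit map() over the columns is needed
date_cols <- df %>% select(where(guessdate)) %>% names()
date_cols
```

Only X should be selected here, since the values in ID and Var1 do not parse as dates in that format.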
Using across() in filter() was deprecated in dplyr 1.0.8, so we use if_all() here.
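As a quick check, the approach can be run end to end on a reduced list. This is only a sketch: small_list is a hypothetical, deterministic stand-in for list_df (integers instead of rnorm() values), and it assumes a dplyr version that provides if_all().

```r
library(dplyr)
library(purrr)
library(stringr)

# Predicate from the question: TRUE if at least one value parses as dd/mm/yyyy
guessdate <- function(x) !all(is.na(as.Date(as.character(x), format = "%d/%m/%Y")))

# Hypothetical reduced input: A is wide (10 rows), B is long (20 rows)
small_list <- list(
  A = data.frame(ID = letters[1:10], Var1 = 1:10),
  B = data.frame(ID = rep(letters[1:10], 2),
                 X = rep(c("01/01/2018", "01/01/2019"), each = 10),
                 Var1 = 1:20)
)

# Filter only the data frames with more than 10 rows; the others pass through unchanged
result <- small_list %>%
  map_if(.p = ~ nrow(.x) > 10,
         .f = ~ filter(.x, if_all(where(guessdate), ~ str_detect(.x, "2018"))))

# Row counts per data frame after filtering
map_int(result, nrow)
```

Both elements should end up with 10 rows: A because it is never filtered, B because only its 2018 rows are kept.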