I have a list of .csv files that I am trying to filter one by one (I can't combine them first and filter afterwards, because there is too much data to load at once).
I want :
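Here is a (fake) example of my data:
library(tidyverse)
# map() iterates over the three columns, so df_list is a list of three data frames
df_list = data.frame(a = seq(1, 20, 1), b = seq(41, 60, 1), c = seq(81, 100, 1)) %>%
  map(~{
    data.frame(a = .x, b = .x * 2, c = .x * 3)
  })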
I then managed to do this:
regrouped_data = df_list %>%
  map(~{
    # Filter
    d2 = .x %>% filter(a > 5)
    # Count
    print(
      tribble(~date, ~initial, ~final,
              "name", nrow(.x), nrow(d2)
      )
    )
    return(d2)
  }) %>%
  bind_rows()
The problem is that I need all the count tables to be assembled into a single table (because I have a lot of files to filter). How can I do that?
It can be nice to lay everything out in a straightforward loop so the logic is clear:
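# Filter every table in df_list and collect the row counts before and after.
filterCount <- function(df_list){
  # Start empty; bind_rows() ignores NULL, so no special case is needed for i == 1
  data_out <- NULL
  count_out <- NULL
  for(i in seq_along(df_list)){
    # Filter the i-th table
    data_flt <- df_list[[i]] %>%
      filter(a > 5)
    # One-row summary: index of the table, rows before and rows after filtering
    count_flt <- tibble(date = i,
                        nrow.total = nrow(df_list[[i]]),
                        nrow.flt = nrow(data_flt))
    # Append to the running results
    data_out <- bind_rows(data_out, data_flt)
    count_out <- bind_rows(count_out, count_flt)
  }
  # Named elements make the result easier to unpack
  return(list(data = data_out, count = count_out))
}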
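The same idea can also be written in the map() style used in the question. The following is only a sketch under assumptions: the helper name filter_one, the list names f1 to f3, and the small fake tables are made up for illustration. Each element returns both the filtered data and a one-row count table, and the two pieces are bound separately at the end.

```r
library(dplyr)
library(purrr)
library(tibble)

# Hypothetical stand-ins for the .csv files: a named list of small tables
df_list <- list(
  f1 = tibble(a = 1:20,   b = a * 2),
  f2 = tibble(a = 41:60,  b = a * 2),
  f3 = tibble(a = 81:100, b = a * 2)
)

# Hypothetical helper: filter one table and record its row counts
filter_one <- function(d, name) {
  d2 <- filter(d, a > 5)
  list(
    data  = d2,
    count = tibble(date = name, initial = nrow(d), final = nrow(d2))
  )
}

# imap() passes each element together with its name
results <- imap(df_list, filter_one)

# Pull out each piece by name and stack it into a single table
regrouped_data <- results %>% map("data") %>% bind_rows()
count_table    <- results %>% map("count") %>% bind_rows()

print(count_table)
```

With real files, filter_one could instead take a file path and read it (for example with readr::read_csv()) before filtering, so that only one file is held in memory at a time; that variation is an assumption and is not shown here.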