Tags: r, loops, purrr, data-cleaning

How do I edit each data frame in a list of files read with list.files()?


I have read a set of Excel files into a list with the following code:

setwd("folder1/folder2")

bases <- list.files(pattern = "*.xlsx")

bases.list <- lapply(bases, read_excel)

The files are named like this: jan20student.xlsx, feb20student.xlsx, ..., jan21student.xlsx, where the number is the year (2020 or 2021). All data frames have the same variables. In each data frame I want to keep specific variables and create a categorical age variable ([10, 20) = young and [20, 40) = adult). I would also like to save each data frame under a name like jan20names.xlsx, etc. Could you give me some suggestions for how to code this?

I tried the following code:

result.list <- map_dfr(bases.list, select(id, exp, name, age))

But it does not work.

Thanks in advance.


Solution

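  • You can try this code:

    library(dplyr)
    library(readxl)   # read_excel() comes from readxl, not dplyr
    
    # list.files() expects a regular expression, not a glob
    bases <- list.files(pattern = "\\.xlsx$")
    
    purrr::map(bases, ~.x %>% 
                read_excel %>%
                select(id, exp, name, age) %>%
                # between() includes both endpoints, so 20 would count as 'young';
                # explicit comparisons give the half-open [10, 20) and [20, 40)
                mutate(age_cat = case_when(age >= 10 & age < 20 ~ 'young', 
                                           age >= 20 & age < 40 ~ 'adult')) %>%
                writexl::write_xlsx(sub('student', 'names', .x)))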
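
  • If you would rather keep working from bases.list, here is a rough base-R sketch of the same steps. The attempt in the question fails because select(id, exp, name, age) is evaluated on its own, before map_dfr ever passes it a data frame, and map_dfr would in any case bind everything into a single data frame instead of keeping one per file. The data frames below are made-up stand-ins for what read_excel would return, and clean_one is just a hypothetical helper name. The age groups use cut() with right = FALSE, which gives left-closed intervals.

```r
# Toy stand-ins for the data frames read from the Excel files (made-up values).
bases <- c("jan20student.xlsx", "feb20student.xlsx")
bases.list <- list(
  data.frame(id = 1:3, exp = c(2, 5, 1), name = c("a", "b", "c"),
             age = c(10, 20, 39), other = c("x", "y", "z")),
  data.frame(id = 4:5, exp = c(3, 7), name = c("d", "e"),
             age = c(19, 40), other = c("u", "v"))
)

# Name each element after its file so the output name can be derived from it.
names(bases.list) <- bases

# Hypothetical helper: keep the wanted columns and add the age category.
clean_one <- function(df) {
  df <- df[, c("id", "exp", "name", "age")]
  # right = FALSE makes the intervals closed on the left: [10, 20) and [20, 40).
  # Ages outside [10, 40) become NA.
  df$age_cat <- cut(df$age, breaks = c(10, 20, 40),
                    labels = c("young", "adult"), right = FALSE)
  df
}

result.list <- lapply(bases.list, clean_one)

# Rename jan20student.xlsx -> jan20names.xlsx, and so on.
names(result.list) <- sub("student", "names", names(result.list), fixed = TRUE)

# To write the files, assuming the writexl package is installed:
# for (f in names(result.list)) writexl::write_xlsx(result.list[[f]], f)
```

    One thing to watch with either approach: once the *names.xlsx files exist in the same folder, a pattern that matches every .xlsx file will pick them up on the next run, so a narrower pattern such as "student\\.xlsx$" may be safer.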