Search code examples
rloopsmergereadxl

Is there a way in R to read multiple excel files, change columns to character, and then merge them?


New-ish to R and I feel like this has a simple solution, but I can't figure it out.

I have 59 excel files that I want to combine. However, 4 of the columns have a mix of dates and NA's (depending on if the study animal is a migrant or not) so R won't let me combine them because some are numeric and some are character. I was hoping to read all of the excel files into R, convert those 4 columns in each file to as.character, and then merge them all. I figured a loop could do this.

Anything I find online has me typing out the name for each read file, which I don't really want to do for 59 files. And once I do have them read into R and those columns converted, can I merge them from R easily? Sorry if this is simple, but I'm not sure what to do that would make this easier.


Solution

  • You can do this quickly using lapply. It was unclear exactly how you wanted to combine the files (a true merge by a common variable, or append the rows, or append the columns). Either way, I do not believe you need to change anything to as.character for any of the below approaches (2A - 2C) to work:

    library(readxl)
    
    # 1. Read in all excel files in a folder given a specific filepath
    filepath <- "your/file/path/"
    file_list <- list.files(path = filepath, pattern='*.xlsx')
    df_list <- lapply(file_list, read_excel)
    
    # 2a. Merge data (assuming a unique identifier, i.e "studyid")
    final_data <- Reduce(function(...) merge(..., by = "studyid", all = TRUE), df_list)
    
    # 2b. If all files have the same columns, append to one long dataset
    final_data <- do.call(rbind, df_list)
    
    # 2c. If you want to make a wide dataset (append all columns)
    final_data <- do.call(cbind, df_list)