r, csv, readr

Read all csv files in a directory and add the name of each file in a new column


I have this code that reads all CSV files in a directory.

nm <- list.files()

df <- do.call(rbind, lapply(nm, function(x) read_delim(x, delim = ";", col_names = TRUE)))

I want to modify it so that the filename is added to the data. The result should be a single data frame containing the rows from all the CSV files, with a column that specifies which file each row came from. How can I do this?


Solution

  • Instead of do.call(rbind, lapply(...)), you can use purrr::map_dfr() with the .id argument. set_names() names each file path after itself, and .id stores those names in a new column:

    library(readr)
    library(purrr)
    
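    # List only the CSV files and name each path after itself, so that
    # .id = "file" can store the file name in a column of the result.
    # delim = ";" is passed explicitly rather than left to be guessed.
    df <- list.files(pattern = "\\.csv$") |>
      set_names() |>
      map_dfr(read_delim, delim = ";", .id = "file")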
    
    df
    
    # A tibble: 9 × 3
      file    col1  col2
      <chr>  <dbl> <dbl>
    1 f1.csv     1     4
    2 f1.csv     2     5
    3 f1.csv     3     6
    4 f2.csv     1     4
    5 f2.csv     2     5
    6 f2.csv     3     6
    7 f3.csv     1     4
    8 f3.csv     2     5
    9 f3.csv     3     6
    

    Example data:

    for (f in c("f1.csv", "f2.csv", "f3.csv")) {
      readr::write_delim(data.frame(col1 = 1:3, col2 = 4:6), f, ";")
    }
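
A possible alternative, sketched here under the assumption that readr 2.0.0 or later is installed: read_delim() accepts a vector of paths and has an id argument that names a column holding each row's source path, so purrr is not strictly needed. The scratch directory csv_demo is hypothetical and only keeps the sketch from touching real files.

```r
library(readr)

# Hypothetical scratch directory (an assumption for this sketch)
dir <- file.path(tempdir(), "csv_demo")
dir.create(dir, showWarnings = FALSE)

# Recreate the example data: three semicolon-delimited files
for (f in c("f1.csv", "f2.csv", "f3.csv")) {
  write_delim(data.frame(col1 = 1:3, col2 = 4:6), file.path(dir, f), delim = ";")
}

# Pick up only CSV files; full.names = TRUE so the paths resolve
# regardless of the current working directory
files <- list.files(dir, pattern = "\\.csv$", full.names = TRUE)

# Read all files in one call; `id` names the column that records
# which path each row came from (requires readr >= 2.0.0)
df <- read_delim(files, delim = ";", id = "file", show_col_types = FALSE)

# Keep just the file name rather than the full path
df$file <- basename(df$file)

df
```

This reads the files in a single call, which assumes they all share the same columns. With purrr 1.0.0 or later, map_dfr() is marked as superseded; the equivalent there is map() followed by list_rbind(names_to = "file").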