Search code examples
rdataframemergedataset

Merge multiple files into one big data table. Column names do not match in the files


I have 50+ csv files in a folder on my computer that I would like merged into 1 giant data table. Below is an example of how 3 out of my 50 tables could look (one, two, and three) and how I would like my final table to look (together).

one <- data.frame("County" =  c("Autauga", "Barbour", "Bibb"), "AAAA" = c(1, 
1, 1), "BBBB" = c(2, 2, 2))

two <- data.frame("County" =  c("Cape May", "Mercer", "Bergen"), "BBBB" = 
c(1, 1, 1), "CCCC" = c(2, 2, 2), "DDDD" = c(1, 2 ,3))

three <- data.frame("County" = c("Lincoln", "Jackson", "Pike"), "CCCC" = 
c(1, 1, 1))

together <- data.frame("County" = c("Autauga", "Barbour", "Bibb", "Cape 
May", "Mercer", "Bergen", "Lincoln", "Jackson", "Pike"), "AAAA" = c(1, 1, 1, 
NA, NA, NA, NA, NA, NA), "BBBB" = c(2, 2, 2, 1, 1, 1, NA, NA, NA), "CCCC" = 
c(NA, NA, NA, 2, 2, 2, 1, 1, 1), "DDDD" = c(NA, NA, NA, 1, 2, 3, NA, NA, 
NA))

If anyone could help me with this, that would be great! Also the blanks do not need to be "NA", they can just be left as blanks.


Solution

  • We can use bind_rows

    library(tidyverse)
    bind_rows(one, two, three)
    

    If there are many datasets, places it in a list and then use bind_rows/rbindlist from data.table

    Instead of creating multiple data.table/data.frame objects in the global env, read it into a list and then use rbindlist

    library(data.table)
    rbindlist(lapply(files, fread))