I have a list of IDs that are not associated with any actual data. I have a SQL database that has a table for each of these IDs, and those tables have data that I would like to combine together into one large data frame based on the list of IDs that I have. I figured a for
loop would be needed for this, but I haven't been able to get it to work properly.
For example I have a list of IDs" 1,2,3,4,5
I have a SQL database with tables for each of these, and they also have other data associated with the IDs. Each ID has multiple rows and columns.
I would like my end product to be the combination of those rows and columns for the list of IDs to be in a single data frame in r. How could I do this? What is the most efficient way to do so?
#Example data set
library(lubridate)
date <- rep_len(seq(dmy("26-12-2010"), dmy("20-12-2011"), by = "days"), 500)
ID <- rep(seq(1, 5), 100)
df <- data.frame(date = date,
x = runif(length(date), min = 60000, max = 80000),
y = runif(length(date), min = 800000, max = 900000),
ID)
for (i in 1: length(ID)){
ID[i] <- dbReadTable(mydb, ID[i])
}
Thank you so much for your time.
I'll expand on my comment to finish the question.
IDs <- lapply(setNames(nm=ID), function(i) dbReadTable(mydb, i))
and then one of:
## base R
IDs <- Map(function(x, nm) transform(x, id = nm), IDs, names(IDs))
DF <- do.call(rbind, IDs)
## dplyr
DF <- dplyr::bind_rows(IDs, .id = "id")
## data.table
DF <- data.table::rbindlist(IDs, idcol = "id")
The addition of the "id"
column is to easily differentiate the rows based on the source ID
. If the table already includes that, then you can omit the Map
(base) and .id
/idcol
arguments.
(This assumes, btw, that all tables have the same exact structure: same column names and same data types.)