Search code examples
rdplyrrpostgresql

dplyr:: append to postgresql remote source if uid not present


Unless I'm missing something there is no reference in documentation to how to use functions such as db_write_table in practice. The only reference in package documentation is.

db_write_table(con, table, types, values, temporary = FALSE, ...)

types is not explained. None of dplyr's remote connectivity functionality seems to be documented in much detail anywhere, e.g. dplyr.tidyverse.org.

I have a database mydb with table mydata I can connect

require(RPostgreSQL)
require(dplyr)

drv <- dbDriver("PostgreSQL")
sapply(dbListConnections(drv), dbDisconnect)
con <- dbConnect(drv, dbname = "mydb", host = "localhost", port = 5432, user = "username")
data_db = tbl(con, 'mydata')

The remote data tbl data_db contains fields with one called uid. I have a data_frame dat with observations I need to append to mydata if their dat$uid values are not already present.

Is this something that can be done with db_write_table or does it require a SQL approach?


Solution

  • Per @hadley's comment to this answer, you can use dbWriteTable(append = TRUE). Here's an example with SQLite:

    library(dplyr)
    
    # Create some example data
    con <- DBI::dbConnect(RSQLite::SQLite(), path = ":dbname:")
    dd <- data.frame(uid = 1:20, value = LETTERS[1:20])
    dat <- data.frame(uid = 15:26, value = LETTERS[15:26])
    copy_to(con, dd, "my_data",
      temporary = FALSE, 
      indexes = list("uid", "value")
    )
    
    # Identify rows to append
    dd_db <- tbl(con, "my_data")
    dd_uid <- select(dd_db, uid) %>%
      collect() %>%
      unlist()
    dat_to_append <- filter(dat, !(uid %in% dd_uid))
    
    DBI::dbWriteTable(con, "my_data", dat_to_append, append = TRUE)