Tags: r, data-modeling, tidyselect

Unable to add primary keys to existing data model object


I'm unable to add primary keys to an existing dm object using dm_add_pk. The error seems to appear only when the values are supplied dynamically (in a for loop, for example) or when they are first stored in a separate variable.

My question is: has anyone else seen this behavior? I mean R behaving differently when given a literal value than when given the exact same value stored in a variable. Maybe this is by design, but I'm clueless.

Minimal (not) working example:
(Note: conn is a connection to a PostgreSQL database created with DBI::dbConnect and RPostgres.)

import::from(dm, new_dm, copy_dm_to, dm_add_pk)
import::from(purrr, map, set_names)

DATA_MODELS <- list(
    "TableName" = list( 
        "cols" = data.frame(
            run_id=character(),
            col_1=character(),
            col_2=numeric()
        ),
        "keys" = c("run_id", "col_2")
    )
)

tables_named_list <- map(DATA_MODELS, ~ set_names(.x, names(.x)))
tables_to_migrate <- list()
tablenames <- list()
for (t in names(tables_named_list)) {
    tables_to_migrate <- append(tables_to_migrate, list(tables_named_list[[t]][["cols"]]))
    tablenames <- append(tablenames, t)
}
names(tables_to_migrate) <- tablenames
data_model <- new_dm(tables_to_migrate)
for (table in names(tables_to_migrate)) {
    # The first line here fails. The 2nd one runs but seems to fail silently (no keys created)
    dm_add_pk(data_model, table=!!table, columns=c(DATA_MODELS[[table]][["keys"]]))
    # dm_add_pk(data_model, table=!!table, columns=c("run_id", "col_2"))
}

# First commented line below also fails. The second one works as expected
# dm_add_pk(data_model, table=!!table, columns=c(DATA_MODELS[[table]][["keys"]]))
# dm_add_pk(data_model, table=!!table, columns=c("run_id", "col_2"))

I tried comparing these values in a number of ways, always with the same result: they are exactly the same.

c("run_id", "col_2") == c(DATA_MODELS[[table]][["keys"]])
typeof(c("run_id", "col_2")) == typeof(c(DATA_MODELS[[table]][["keys"]]))
class(c("run_id", "col_2")) == class(c(DATA_MODELS[[table]][["keys"]]))
compare(c("run_id", "col_2"), c(DATA_MODELS[[table]][["keys"]]))
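Checks like these compare values, so they cannot show a difference that only exists before evaluation. A function that captures its argument unevaluated sees the expression that was typed, not the value it produces. The sketch below illustrates that general mechanism with rlang; show_captured is a hypothetical helper written for this illustration, not part of dm, and it is only an assumption that dm_add_pk treats its columns argument in a comparable way.

```r
library(rlang)

# Hypothetical helper: returns its argument as captured, without evaluating it.
show_captured <- function(columns) {
  enexpr(columns)
}

keys <- c("run_id", "col_2")

literal  <- show_captured(c("run_id", "col_2"))  # a call to c() with two strings
variable <- show_captured(keys)                  # just the symbol `keys`
injected <- show_captured(!!keys)                # the character vector itself

# The three captured objects differ even though the underlying values are equal.
print(class(literal))
print(class(variable))
print(class(injected))
```

In the injected case the value is already inlined when the function receives it, so the function no longer has to look anything up by name.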

Additional info:

  • dm: 1.0.3
  • R: 4.2.2
  • DB: PostgreSQL 14.6 (Debian 14.6-1.pgdg110+1)
  • DBI: Version: 1.1.3, Source: Repository, Repository: CRAN, Hash: b2866e62bab9378c3cc9476a1954226b
  • OS: Linux 5.15.0-58-generic #64-Ubuntu SMP Thu Jan 5 11:43:13 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

Solution

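  • The solution was to pass the columns argument with !!, just as I did for the table argument. I don't know exactly why this works, but it does; presumably dm_add_pk captures columns as an unevaluated expression (tidyselect style), and !! forces the value to be computed in my own environment first. Like so: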

    dm_add_pk(data_model, table=!!table, columns=!!c(DATA_MODELS[[table]][["keys"]]))
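For completeness, here is a minimal sketch of the whole loop with the fix applied, assuming the same DATA_MODELS structure as in the question and no database connection. The names tbl_name and keys are introduced only for this example. Note that dm_add_pk returns a modified copy of the dm rather than changing it in place, so the result has to be assigned back; discarding it inside the loop would explain why the variant that ran without error appeared to create no keys.

```r
library(dm)

DATA_MODELS <- list(
  TableName = list(
    cols = data.frame(
      run_id = character(),
      col_1 = character(),
      col_2 = numeric()
    ),
    keys = c("run_id", "col_2")
  )
)

# Build a named list of empty tables and wrap it in a dm object.
tables_to_migrate <- lapply(DATA_MODELS, function(x) x[["cols"]])
data_model <- new_dm(tables_to_migrate)

for (tbl_name in names(DATA_MODELS)) {
  # Look the key columns up first, then inject the value with !!.
  keys <- DATA_MODELS[[tbl_name]][["keys"]]
  # Assign the result back: dm_add_pk does not modify data_model in place.
  data_model <- dm_add_pk(data_model, table = !!tbl_name, columns = !!keys)
}

# Inspect the primary keys that are now registered.
pks <- dm_get_all_pks(data_model)
print(pks)
```

Storing the keys in a local variable before the call keeps the injected expression short and makes it easier to inspect what is being passed.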