Tags: r, dplyr, lazy-evaluation, collect

Extract a dplyr tbl column as a vector


Is there a more succinct way to get one column of a dplyr tbl as a vector when the tbl has a database back-end (i.e. the data frame/table can't be subset directly)?

require(dplyr)
db <- src_sqlite(tempfile(), create = TRUE)
iris2 <- copy_to(db, iris)
iris2$Species
# NULL

That would have been too easy, so

collect(select(iris2, Species))[, 1]
# [1] "setosa"     "setosa"     "setosa"     "setosa"  etc.

But it seems a bit clumsy.


Solution

  • With dplyr >= 0.7.0, you can use pull() to get a vector from a tbl. It works on database-backed tbls as well as local ones.


    library(dplyr, warn.conflicts = FALSE)
    db <- src_sqlite(tempfile(), create = TRUE)
    iris2 <- copy_to(db, iris)
    vec <- pull(iris2, Species)
    head(vec)
    #> [1] "setosa" "setosa" "setosa" "setosa" "setosa" "setosa"
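
For comparison, here is a minimal sketch of a few equivalent ways to get the column. It assumes the DBI, RSQLite and dbplyr packages are installed and uses an in-memory SQLite database via DBI::dbConnect() instead of src_sqlite(), since more recent dplyr releases steer toward DBI connections. The variable names (con, species, last_col, species_old) are only illustrative.

```r
library(dplyr, warn.conflicts = FALSE)

# Assumption: DBI, RSQLite and dbplyr are installed. The in-memory SQLite
# database is a stand-in for whatever back-end you actually use.
con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
iris2 <- copy_to(con, iris, name = "iris")

# pull() by column name, written as a pipeline
species <- iris2 %>% pull(Species)

# pull() by position: negative numbers count from the right,
# so -1 (the default) is the last column, here Species
last_col <- iris2 %>% pull(-1)

# Without pull(): collect the single column, then extract it with [[ ]],
# which returns a vector for both data frames and tibbles
species_old <- collect(select(iris2, Species))[["Species"]]

# All three approaches should give the same character vector
identical(species, last_col)
identical(species, species_old)

DBI::dbDisconnect(con)
```

Using [[ ]] rather than [, 1] in the older idiom matters because single-bracket subsetting of a tibble returns another tibble, not a vector.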