r, bioinformatics, dbplyr, biomart

Error: ! Failed to collect lazy table. Caused by error in `db_collect()` - using biomaRt package in R


I'm currently working on a bioinformatics project using R, and I'm encountering an error when trying to use the biomaRt package. After installing the package and loading it into R, I tried to select a biomaRt database to use in my analysis.

Here's the code I ran when I received an error:

library(biomaRt)
ensembl <- useEnsembl(biomart = "ensembl", dataset = "hsapiens_gene_ensembl")

The error message:

Error in `collect()`:
! Failed to collect lazy table.
Caused by error in `db_collect()`:
! Arguments in `...` must be used.
✖ Problematic argument:
• ..1 = Inf
ℹ Did you misspell an argument name?

Backtrace:
     ▆
  1. ├─biomaRt::useEnsembl(biomart = "genes", dataset = "hsapiens_gene_ensembl")
  2. │ └─biomaRt:::.getEnsemblSSL()
  3. │   └─BiocFileCache::BiocFileCache(cache, ask = FALSE)
  4. │     └─BiocFileCache:::.sql_create_db(bfc)
  5. │       └─BiocFileCache:::.sql_validate_version(bfc)
  6. │         └─BiocFileCache:::.sql_schema_version(bfc)
  7. │           ├─base::tryCatch(...)
  8. │           │ └─base (local) tryCatchList(expr, classes, parentenv, handlers)
  9. │           └─tbl(src, "metadata") %>% collect(Inf)
 10. ├─dplyr::collect(., Inf)
 11. └─dbplyr:::collect.tbl_sql(., Inf)
 12.   ├─base::tryCatch(...)
 13.   │ └─base (local) tryCatchList(expr, classes, parentenv, handlers)
 14.   │   └─base (local) tryCatchOne(expr, names, parentenv, handlers[[1L]])
 15.   │     └─base (local) doTryCatch(return(expr), name, parentenv, handler)
 16.   └─dbplyr::db_collect(x$src$con, sql, n = n, warn_incomplete = warn_incomplete, ...)

R Version: 4.3.1 (2023-06-16 ucrt)
biomaRt Version: 2.58.0
Operating System: Windows

I tried updating all the packages (biomaRt and dbplyr) and restarting R, but nothing helped.

I would greatly appreciate any guidance or insights on how to resolve this error. Thank you in advance for your help!


Solution

  • This is probably due to the dbplyr upgrade. There is already a merged PR in BiocFileCache that fixes it.

    The Inf in tbl(src, "metadata") %>% collect(Inf) is passed positionally, and it apparently ends up in the ... of dbplyr::db_collect, whose signature is function(con, sql, n = -1, warn_incomplete = TRUE, ...), instead of being matched to the n argument.

    While BiocFileCache 2.10.1 is waiting to be built on the Bioconductor servers, downgrading dbplyr solves this issue for me (devtools::install_version("dbplyr", version = "2.3.4")).

    I suppose installing the latest BiocFileCache from their GitHub repo would work as well.
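
To illustrate why a positional Inf can trigger this kind of error, here is a minimal base R sketch. The function collect_sketch is a hypothetical stand-in, not the real dbplyr code: it assumes, as the backtrace suggests, that n sits after ... in the newer method and that unused arguments in ... are rejected. The real check in dbplyr is implemented differently; only the argument-matching behaviour is the point here.

```r
# Hypothetical stand-in, NOT the real dbplyr function.
# `n` comes after `...`, so it can only be set by name.
collect_sketch <- function(x, ..., n = Inf) {
  # Imitate a strict "dots must be used" check (simplified assumption).
  if (...length() > 0) {
    stop("Arguments in `...` must be used.", call. = FALSE)
  }
  # Return at most n rows.
  x[seq_len(min(n, nrow(x))), , drop = FALSE]
}

rows <- data.frame(id = 1:3)

# Positional call, like collect(Inf): Inf is captured by `...`, not `n`.
positional <- tryCatch(
  collect_sketch(rows, Inf),
  error = function(e) conditionMessage(e)
)

# Named call: Inf is matched to `n`, and all rows come back.
named <- collect_sketch(rows, n = Inf)
```

In the sketch, the positional call fails and the named call succeeds, which matches the pattern in the backtrace above. To see which versions are actually installed before and after a downgrade, packageVersion("dbplyr") and packageVersion("BiocFileCache") can be run in the same session.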