Tags: r, bioinformatics, lapply, biomart

Issue with lapply using biomart


I am trying to use lapply to swap in each species name when extracting genes, instead of doing it for human only.

I'm still learning how to use lapply, and I can't work out what I'm doing wrong.

So far I have:

library(biomaRt)

I create the marts:

ensembl_hsapiens <- useMart("ensembl", 
                        dataset = "hsapiens_gene_ensembl")
ensembl_mmusculus <- useMart("ensembl", 
                     dataset = "mmusculus_gene_ensembl")
ensembl_ggallus <- useMart("ensembl",
                       dataset = "ggallus_gene_ensembl")

Set the species:

species <- c("hsapiens", "mmusculus", "ggallus")

I then try to use lapply:

species_genes <- lapply(species, function(s) getBM(attributes = c("ensembl_gene_id", 
                                                  "external_gene_name"), 
                                   filters = "biotype", 
                                   values = "protein_coding", 
                                   mart = paste0(s, "_ensembl")))

It gives me an error message saying:

Error in martCheck(mart) : You must provide a valid Mart object. To create a Mart object use the function: useMart. Check ?useMart for more information.


Solution

  • Look the Mart object up by name with get() instead of passing the name as a string:

    species_genes <- lapply(species, function(s) getBM(attributes = c("ensembl_gene_id", 
                                                                      "external_gene_name"), 
                                                       filters = "biotype", 
                                                       values = "protein_coding", 
                                                       mart = get(paste0("ensembl_", s))))
    

    Explanation:

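    The mart argument of getBM() expects an object of class Mart, not a character string. Note also that paste0(s, "_ensembl") in the question builds "hsapiens_ensembl", which does not match the object names (ensembl_hsapiens), so the order of the two parts is swapped in the fix.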

    class(ensembl_ggallus)
    #output
    [1] "Mart"
    attr(,"package")
    [1] "biomaRt"
    

    By using

    paste0("ensembl_", s)
    

    you get a string such as:

    "ensembl_hsapiens"
    

    The base function get() looks an object up by its name, searching the current environment and then its enclosing environments, and returns the object itself:

    get("ensembl_hsapiens") 
    
    identical(get("ensembl_hsapiens"), ensembl_hsapiens)
    #output
    [1] TRUE
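
A possible alternative to get() is to keep the objects in a named list and let lapply iterate over the objects themselves, so no name lookup is needed. The sketch below is only an illustration in base R: the three ensembl_* objects are stand-ins (plain lists with a made-up class "FakeMart") so that it runs without biomaRt or a network connection, and describe_mart() is a hypothetical placeholder for the getBM() call.

```r
# Stand-in objects; in the real script these would be the Mart objects
# returned by useMart(). "FakeMart" is a made-up class for illustration.
ensembl_hsapiens  <- structure(list(dataset = "hsapiens_gene_ensembl"),
                               class = "FakeMart")
ensembl_mmusculus <- structure(list(dataset = "mmusculus_gene_ensembl"),
                               class = "FakeMart")
ensembl_ggallus   <- structure(list(dataset = "ggallus_gene_ensembl"),
                               class = "FakeMart")

# Collect the objects in a named list: the names are the species,
# the elements are the objects themselves (not strings).
marts <- list(hsapiens  = ensembl_hsapiens,
              mmusculus = ensembl_mmusculus,
              ggallus   = ensembl_ggallus)

# Hypothetical placeholder for getBM(); it only reads a field from
# the object it receives.
describe_mart <- function(mart) mart$dataset

# lapply passes each object in turn and keeps the list names on the result.
result <- lapply(marts, describe_mart)
```

With real Mart objects, the body of describe_mart() would be replaced by the getBM() call from the answer with mart = mart, and the result would be a list of data frames named by species.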