Search code examples
c++11rcpparmadillo

Error message: Cube::init(): requested size is too large; suggest to enable ARMA_64BIT_WORD using sommer package for GWAS


Output from sessionInfo()

R version 4.0.5 (2021-03-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                           LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] sommer_4.1.4      crayon_1.4.1      lattice_0.20-41   MASS_7.3-53.1     Matrix_1.3-2      data.table_1.14.0

loaded via a namespace (and not attached):
[1] compiler_4.0.5  tools_4.0.5     rstudioapi_0.13 Rcpp_1.0.6      grid_4.0.5 

I have been trying to carry out a gwas using the sommer package with the following code:

var_cov <- A.mat(m_matrix) ## aditive relationship matrix

model <- GWAS(cbind(DW20, PLA07, PLA08, PLA09, PLA10, PLA11, PLA12, PLA13, PLA14, PLA15, PLA16, PLA17, PLA18, RGR07_09, RGR08_10, RGR09_11, RGR10_12, RGR11_13, RGR12_14, RGR13_15, RGR14_16, RGR15_17, RGR16_18, SA, SL, SW) ~ 1, random = ~ vs(accession, Gu = var_cov), data = pheno2, M = m_matrix, gTerm = "u:accession", n.PC = 5)

As described in the code, I have 26 traits and I would like to use the K+P model. My SNPs matrix has 211 260 markers and 309 accessions.

When I run this code for one and two traits, it works fine. But, when I try to run with all the 26 traits I get the error message:

Error in GWAS(cbind(DW20, PLA07, PLA08, PLA09, PLA10, PLA11, PLA12, PLA13,  : 
  Cube::init(): requested size is too large; suggest to enable ARMA_64BIT_WORD

I searched online and found that this error is related to the package RcppArmadillo. Following the suggestions here (http://arma.sourceforge.net/docs.html#config_hpp_arma_64bit_word) and here (Large Matrices in RcppArmadillo via the ARMA_64BIT_WORD define), I tried to enable the ARMA_64BIT_WORD by uncommenting the line #define ARMA_64BIT_WORD (bellow) in the file RcppArmadillo\include\armadillo_bits\config.hpp:

#if !defined(ARMA_64BIT_WORD)
//#define ARMA_64BIT_WORD 
//// Uncomment the above line if you require matrices/vectors capable of holding more than 4 billion elements.
//// Note that ARMA_64BIT_WORD is automatically enabled when std::size_t has 64 bits and ARMA_32BIT_WORD is not defined.
#endiff 

and also including the following line in the file Makevars.win in RcppArmadillo\skeleton.

PKG_CPPFLAGS = -DARMA_64BIT_WORD=1 

None of the suggestions worked and I continue getting the same error message. My questions are: is there another option to enable the ARMA_64BIT_WORD that I am missing? Is it possible to run the GWAS function in sommer package with as many traits as 26 or this number is too much? Would the error message result from a mistake in the GWAS code?

Thank you very much in advance.


Solution

  • My first take Ana is that you're trying to fit an unstructured multivariate model with 26 traits when you use cbind(), that means that if you have 1000 records, this will be a model of 309 x 26 = 8,034 records which would be a bit too big for the direct inversion algorithm that sommer uses, plus the number of parameters to estimate are a lot (think all the covariance parameters (26*25)/2 = 325. I would suggest fitting a GWAS per trait in a for loop to solve your issue. Unless you have a good justification to run a multivariate GWAS this is the issue with your analysis more than the C++ code behind. For example:

    var_cov <- A.mat(m_matrix) ## aditive relationship matrix
    traits <- c(DW20, PLA07, PLA08, PLA09, PLA10, PLA11, PLA12, PLA13, PLA14, PLA15, PLA16, PLA17, PLA18, RGR07_09, RGR08_10, RGR09_11, RGR10_12, RGR11_13, RGR12_14, RGR13_15, RGR14_16, RGR15_17, RGR16_18, SA, SL, SW)
    
    for(itrait in traits){
    
    model <- GWAS(as.formula(paste(itrait,"~1")), random = ~ vs(accession, Gu = var_cov), data = pheno2, M = m_matrix, gTerm = "u:accession", n.PC = 5)
    
    }
    

    If it turns out that even with a single trait the arma::cube function presents memory issues then definitely we need to look at why the armadillo library cannot deal with those dimensions.

    Cheers, Eduardo