I've created a term-document matrix (TDM) in R which I want to write to a file. It is a large sparse matrix in simple triplet form, roughly 20,000 x 10,000. When I convert it to a dense matrix so that I can add columns with cbind, I get out-of-memory errors and the process never completes. I don't want to increase my RAM.
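Roughly what I'm doing, as a minimal sketch (I'm using the tm package; the two-document corpus here just stands in for my real ~10,000-document corpus):

```r
library(tm)

corpus <- VCorpus(VectorSource(c("some example text", "more example text")))
tdm_tf    <- TermDocumentMatrix(corpus)          # raw term counts, sparse
tdm_tfidf <- TermDocumentMatrix(corpus,
                                control = list(weighting = weightTfIdf))

# Densifying in order to cbind is the step that runs out of memory at full size:
combined <- cbind(as.matrix(tdm_tf), as.matrix(tdm_tfidf))
write.csv(combined, "tdm.csv")
```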
Also, I want to:

- bind the tf and tf-idf matrices together
- save the sparse/dense matrix to CSV
- run batch machine learning algorithms such as the J48 implementation in Weka
How do I save/load the dataset and run batch ML algorithms within these memory constraints?

If I can write the sparse matrix out to a data store, can I then run ML algorithms in R directly on the sparse matrix, still within memory constraints?
There are a couple of possible solutions:

1) Convert your matrix from double to integer if you are dealing with integer counts. Integers need half the memory of doubles (see the first sketch below).

2) Try the bigmemory package (see the second sketch below).
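A quick sketch of point 1 (assuming your TDM comes from tm, which stores it as a slam simple_triplet_matrix, so only its value slot `v` needs converting):

```r
# An integer takes 4 bytes in R vs. 8 bytes for a double, so for a large
# matrix the data roughly halves in size.
m <- matrix(as.numeric(1:10), nrow = 2)
object.size(m)                  # stored as doubles

storage.mode(m) <- "integer"    # convert in place
object.size(m)                  # noticeably smaller

# For a simple triplet matrix, only the non-zero values need converting
# (toy matrix here; yours would be the real TDM):
library(slam)
stm <- simple_triplet_matrix(i = c(1, 2), j = c(1, 3),
                             v = c(5, 2), nrow = 2, ncol = 3)
stm$v <- as.integer(stm$v)      # raw counts stay exact
```

Note this only helps for the raw count (tf) matrix; tf-idf weights are fractional, so they have to stay as doubles.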
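And a sketch of point 2: a file-backed big.matrix lives on disk rather than in RAM, so you can build the combined dense matrix chunk by chunk. The tiny simple_triplet_matrix below is just a stand-in for your real tf and tf-idf matrices, and the file names are examples:

```r
library(bigmemory)
library(slam)

# Stand-ins for the real tf and tf-idf matrices (same terms x docs shape):
tdm_tf    <- simple_triplet_matrix(i = c(1, 2, 4), j = c(1, 3, 6),
                                   v = c(2, 1, 5), nrow = 4, ncol = 6)
tdm_tfidf <- tdm_tf  # placeholder; in practice the tf-idf weighted TDM

# File-backed matrix wide enough to hold both, i.e. the cbind result:
bm <- filebacked.big.matrix(nrow = nrow(tdm_tf),
                            ncol = ncol(tdm_tf) + ncol(tdm_tfidf),
                            type = "double",   # tf-idf values are fractional
                            backingfile = "tdm.bin",
                            descriptorfile = "tdm.desc")

# Fill in column chunks so only one small dense block is in RAM at a time:
chunk <- 2
for (start in seq(1, ncol(tdm_tf), by = chunk)) {
  cols <- start:min(start + chunk - 1, ncol(tdm_tf))
  bm[, cols] <- as.matrix(tdm_tf[, cols])
  bm[, cols + ncol(tdm_tf)] <- as.matrix(tdm_tfidf[, cols])
}
```

Because the matrix is backed by tdm.bin on disk, you can reattach it in a later session with attach.big.matrix("tdm.desc"), which also covers the save/load part of your question without a round trip through CSV. For J48, note that RWeka expects an ordinary data.frame, so you would still need to pull manageable chunks out of the big.matrix (e.g. after feature selection) before training.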