
using glm with large data set in R - memory exhausted


I have a large data set (>6 million rows and 12 columns) that I am trying to fit a logistic regression to. The first column of the data frame is named Dep1 and holds either 0 or 1 values. The other columns are named Var1, Var2, ..., Var11 and are the independent variables I am interested in. Some of the columns in the data frame are factors, while others are numeric. I am running glm with the following call:

mylogit <- glm(Dep1 ~ Var1 + Var2 + Var3 + Var4 + Var5 + Var6 + Var7 + Var8 + Var9 + Var10 + Var11, data = dataset, family = binomial())

When I call glm with all the variables I get the message:

Error: vector memory exhausted (limit reached?)

I am able to run glm with a smaller set of variables, such as only Var1 through Var4, but I would like to fit the model with all the variables. Any suggestions for addressing this error?
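For reference, one common memory-saving route for a data set this size is to fit the model in chunks with the biglm package, which updates the fit incrementally instead of building the full model matrix at once (that matrix is typically what exhausts memory in glm). This is a sketch, not the asker's method; it assumes biglm is installed and that dataset fits in memory as a data frame:

```r
# Sketch: chunked logistic regression with biglm (assumes install.packages("biglm")).
library(biglm)

f <- Dep1 ~ Var1 + Var2 + Var3 + Var4 + Var5 + Var6 +
     Var7 + Var8 + Var9 + Var10 + Var11

# bigglm processes the data chunksize rows at a time, so the full
# n x p model matrix is never materialized in one piece.
mylogit <- bigglm(f, data = dataset, family = binomial(), chunksize = 10000)
summary(mylogit)
```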


Solution

  • I ended up following the steps here, and that appears to have resolved my issue once I restarted R.
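The linked steps are not preserved above, but "Error: vector memory exhausted (limit reached?)" is the message R on macOS emits when it hits its vector heap limit, and the commonly cited fix is to raise that limit by setting R_MAX_VSIZE in ~/.Renviron and then restarting R. The value below is an assumption; keep it at or below your machine's physical RAM plus swap:

```
# ~/.Renviron  (hypothetical value; tune to your machine's RAM)
R_MAX_VSIZE=64Gb
```

R reads ~/.Renviron at startup, which is why the change only takes effect after a restart, consistent with the answer above.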