I am trying to reshape a survey dataset (a data frame with about 11k variables and 2k rows) into a long(er) format so that I can analyze variables that resulted from looped questions. So far I have not been able to find a way around memory allocation errors.
Am I hitting the practical size limit for using melt on data frames (the data is about 28 MB as a CSV file)? Is there a different way to use melt, or would you use a different function or library for this purpose?
I've tried using reshape2's melt function, which should be straightforward but gives a memory error immediately ("cannot allocate vector of size...").
Then I tried breaking up the looped variables into chunks, in order to get many smaller dataframes to melt and then re-constitute. That gives me similar errors (with smaller sizes that cannot be allocated).
For reference, my data has an identifier field ("SbjNum"), about 1900 variables that occur only once, and 99 variables that occur 100 times each (with a prefix of "I_X_I_Y", where X and Y identify the loops). The looped variables should be melted into rows corresponding to each unique combination of X and Y.
Just using melt naively looked like this:
library(reshape2)
molten <- melt(data, id.vars = "SbjNum")
The chunking I've tried so far looks like this:
# all variable names produced by the loops
loops <- grep("I_\\d{1,2}_I_\\d{1,2}", names(data), value = TRUE)
# number of desired chunks
nloopvars <- length(loops)
nchunks <- 100
# split the loop variable names into nchunks groups of consecutive names
chunks <- lapply(
  split(seq_len(nloopvars), sort(seq_len(nloopvars) %% nchunks)),
  function(v) loops[v]
)
# melt small subsets of the data
molten <- lapply(chunks, function(x) {
  # take only the identifier and a subset of the loop variables
  df <- data[c("SbjNum", x)]
  # melt the loop variables
  melt(df, id.vars = "SbjNum")
})
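The chunked lapply() above returns a list of melted data frames, so the "re-constitute" step still has to stack them and recover X and Y. A minimal base R sketch of that step follows. It runs on a small hand-built stand-in for the list rather than on the real data, and the variable names ("I_1_I_1_q" and so on) and the object names molten_chunks and long are made up for illustration.

```r
# Toy stand-in for the list returned by the chunked lapply(): each element
# has the three columns melt() produces (SbjNum, variable, value).
# The variable names below are hypothetical.
molten_chunks <- list(
  data.frame(SbjNum = c(1, 2), variable = "I_1_I_1_q",  value = c(10, 20)),
  data.frame(SbjNum = c(1, 2), variable = "I_1_I_2_q",  value = c(30, 40)),
  data.frame(SbjNum = c(1, 2), variable = "I_10_I_2_q", value = c(50, 60))
)

# Stack all chunks into one long data frame.
long <- do.call(rbind, molten_chunks)
rownames(long) <- NULL

# Extract the loop indices X and Y from the "I_X_I_Y" prefix.
pattern <- "^I_(\\d{1,2})_I_(\\d{1,2}).*$"
vars <- as.character(long$variable)
long$X <- as.integer(sub(pattern, "\\1", vars))
long$Y <- as.integer(sub(pattern, "\\2", vars))
```

With many large chunks, do.call(rbind, ...) can itself be slow and memory-hungry, so this is only a sketch of the idea and not a tuned solution.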
EDIT: After terminating and restarting R and clearing my workspace in several different ways, the second approach (chunking) now works.
After terminating and restarting R and clearing the workspace several times, my own "chunking" approach is now working (see the question). I recommend trying this if anyone else runs into similar issues.
[There is still a question of up to what size melting makes sense, but I can live without knowing that answer for now.]
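As a rough back-of-the-envelope check on that size question, the arithmetic below estimates how large the fully melted result would be. The row and variable counts are the approximate figures from the question. The bytes-per-row figure is an assumption (numeric SbjNum, a factor for the variable column, a numeric value column), so treat the result as an order-of-magnitude guess only.

```r
# Approximate dimensions from the question.
n_rows    <- 2000    # respondents ("about 2k rows")
n_measure <- 11000   # variables melted ("about 11k variables")

# melt() produces one output row per (respondent, measured variable) pair.
molten_rows <- n_rows * n_measure

# Assumed storage per melted row, in bytes:
#   8 for a numeric SbjNum, 4 for the factor code of `variable`,
#   8 for a numeric `value`.
bytes_per_row <- 8 + 4 + 8

# Estimated size of the melted data frame in MB.
est_mb <- molten_rows * bytes_per_row / 1024^2
```

Under these assumptions the melted result has 22 million rows and takes roughly 420 MB, before counting any intermediate copies made while melting. That is far larger than the 28 MB CSV, which may explain why a session with an already crowded workspace ran out of memory.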