When I run the code shown below, I get a java.lang.OutOfMemoryError: GC overhead limit exceeded on line 16 (svm_node node = new svm_node();). The code runs on an array of ~1 million elements, where each element holds 100 shorts.
    // read in a problem (in svmlight format)
    private void read(SupportVector[] vectors) throws IOException
    {
        int length = vectors.length; // Length of training data
        double[] classification = new double[length]; // This is redundant for our one-class SVM.
        svm_node[][] trainingSet = new svm_node[length][]; // The training set.
        for(int i = 0; i < length; i++)
        {
            classification[i] = 1; // Since classifications are redundant in our setup, they all belong to the same class, 1.
            // each vector. The vector has to be one index longer than the actual vector,
            // because the implementation needs an empty node in the end with index -1.
            svm_node[] vector = new svm_node[vectors[i].getLength() + 1];
            double[] doubles = vectors[i].toDouble(); // The SVM runs on doubles.
            for(int j = 0; j < doubles.length; j++) {
                svm_node node = new svm_node();
                node.index = j;
                node.value = doubles[j];
                vector[j] = node;
            }
            svm_node last = new svm_node();
            last.index = -1;
            vector[vector.length - 1] = last;
            trainingSet[i] = vector;
        }
        svm_problem problem = new svm_problem();
        problem.l = length;
        problem.y = classification;
        problem.x = trainingSet;
    }
From the exception, I guess the garbage collector cannot properly sweep up my new svm_nodes, but I cannot see how to optimize my object creation to avoid creating too many new svm_nodes that sit helplessly in the heap.
I cannot change the data structure, as it is what LIBSVM uses as input to its support vector machine.
My question is: Is this error related to the garbage collector not being able to collect my svm_nodes, or am I simply trying to parse a data structure with too many elements?
PS: I have already set the heap size to the maximum for my 32-bit application (2 GB).
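For a sense of scale, here is a back-of-the-envelope estimate of the structure that read() builds. This is only a sketch: the header, reference, and alignment sizes are assumptions based on a typical 32-bit HotSpot object layout, not measured values, and the class and method names are hypothetical.

    public class SvmFootprintEstimate {
        // Assumed sizes for a 32-bit HotSpot JVM; real values depend on the JVM and its flags.
        static final long OBJECT_HEADER_BYTES = 8;  // assumption
        static final long ARRAY_HEADER_BYTES = 12;  // assumption
        static final long REFERENCE_BYTES = 4;      // assumption
        static final long ALIGNMENT = 8;            // assumption

        // Round an object size up to the next multiple of the alignment.
        static long align(long bytes) {
            return (bytes + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
        }

        // An svm_node holds one int (index) and one double (value).
        static long nodeBytes() {
            return align(OBJECT_HEADER_BYTES + Integer.BYTES + Double.BYTES);
        }

        // One svm_node[] row: the reference array plus the nodes it points to.
        static long rowBytes(int nodesPerVector) {
            long array = align(ARRAY_HEADER_BYTES + REFERENCE_BYTES * nodesPerVector);
            return array + nodesPerVector * nodeBytes();
        }

        // Whole training set; the +1 is the terminating node with index -1.
        static long estimateBytes(long vectors, int valuesPerVector) {
            return vectors * rowBytes(valuesPerVector + 1);
        }

        public static void main(String[] args) {
            long bytes = estimateBytes(1_000_000L, 100);
            System.out.println("Estimated training set size in bytes: " + bytes);
        }
    }

Under these assumptions the training set alone comes to roughly 2.8 GB (about 101 million svm_node objects at 24 bytes each, plus the reference arrays), before counting the source vectors or the temporary double[] arrays. That would not fit in a 2 GB heap, whatever the garbage collector does.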
I launched the application in a 64-bit environment and raised the heap to more than 2 GB, which solved the problem. I still believe there is a weird GC quirk, but I was unable to find it.
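To confirm that a larger heap setting actually takes effect, the limit the JVM reports can be printed at startup. This is a minimal sketch; the class name HeapCheck is hypothetical, and it uses only the standard Runtime.maxMemory() call.

    public class HeapCheck {
        // Maximum heap the JVM will attempt to use, converted to GiB.
        static double maxHeapGiB() {
            return Runtime.getRuntime().maxMemory() / (1024.0 * 1024.0 * 1024.0);
        }

        public static void main(String[] args) {
            System.out.println("Max heap (GiB): " + maxHeapGiB());
        }
    }

Running it with, for example, java -Xmx4g HeapCheck on a 64-bit JVM should report a value close to 4; the exact figure can be slightly lower depending on the collector in use.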