When I run the code below, the whole OS hangs on the second iteration of the loop. If I open Task Manager, it clearly shows a huge memory leak: once the code starts executing, all the memory is gone within 4 seconds.
Here's the code:
void matrix_vector_multiplication_comparison()
{
    for (unsigned DIMS_SIZE = 64; DIMS_SIZE <= 2048; DIMS_SIZE += 64)
    {
        __declspec(align(16)) float* m1 = generate_random_1d_matrix(DIMS_SIZE * DIMS_SIZE);
        __declspec(align(16)) float* m2 = generate_random_1d_matrix(DIMS_SIZE * DIMS_SIZE);
        __declspec(align(16)) float* v1 = generate_random_1d_matrix(DIMS_SIZE);
        __declspec(align(32)) float* v2 = generate_random_1d_matrix(DIMS_SIZE);
        __declspec(align(16)) float* res1 = new float[DIMS_SIZE];
        __declspec(align(16)) float* res2 = new float[DIMS_SIZE];
        __declspec(align(32)) float* res3 = new float[DIMS_SIZE];
        // ........ other stuff here...........
        delete[] m1;
        delete[] m2;
        delete[] v1;
        delete[] v2;
        delete[] res1;
        delete[] res2;
        delete[] res3;
    }
}
When I comment out everything in my code and leave only the __declspec(align()) declarations and the delete[]'s inside my for loop, the memory leak is still there, which suggests that the problem is actually with those __declspec's.
The functions generate_random_1d_matrix, get_random_float and main look like this:
float* generate_random_1d_matrix(unsigned const int dims)
{
    size_t i;
    float* result = new float[dims * dims];
    for (i = 0; i < dims * dims; ++i)
        result[i] = get_random_float(10, 100);
    return result;
}
inline float get_random_float(float min, float max)
{
    float f = (float)rand() / RAND_MAX;
    return min + f * (max - min);
}
int main()
{
    matrix_vector_multiplication_comparison();
    return 0;
}
Could anybody tell me what's going wrong here and how to solve that memory problem?
I changed the code provided and left only the parts that actually produce the problem.
Try lowering 2048 to a more reasonable number. As it is, you are trying to allocate millions of floats in large blocks, which doesn't seem reasonable. (It might actually be tens of millions.)
Even at just 128, you are trying to allocate 128^4*2 floats, which is over 500 million (about 2 GiB at 4 bytes per float). I lowballed a little in my previous explanation. Even 64 is probably approaching too high.
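To put numbers on that, here is a rough sketch of the arithmetic. The helper names floats_per_matrix and matrix_bytes_per_iteration are mine, not from the question, and it assumes 4-byte floats. It uses 64-bit integers so the sizes themselves don't wrap around.

```cpp
#include <cassert>
#include <cstdint>

// Floats allocated by one generate_random_1d_matrix(DIMS_SIZE * DIMS_SIZE)
// call as posted: the caller passes dims = DIMS_SIZE^2, and the function
// then allocates dims * dims elements, i.e. DIMS_SIZE^4.
std::uint64_t floats_per_matrix(std::uint64_t dims_size)
{
    // Guard chosen for this sketch: keeps every product below within 64 bits.
    assert(dims_size <= 8192);
    std::uint64_t dims = dims_size * dims_size;
    return dims * dims;
}

// Bytes requested for m1 and m2 together in one loop iteration,
// assuming sizeof(float) == 4.
std::uint64_t matrix_bytes_per_iteration(std::uint64_t dims_size)
{
    return 2 * floats_per_matrix(dims_size) * 4;
}
```

With these, matrix_bytes_per_iteration(64) is 134,217,728 (128 MiB) and matrix_bytes_per_iteration(128) is 2,147,483,648 (2 GiB), which matches the machine falling over on the second iteration.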
I'm almost positive the problem is in generate_random_1d_matrix: where you use dims * dims, you should be using just dims. It's a 1D matrix after all, and the caller already passes DIMS_SIZE * DIMS_SIZE as the size.
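Something along these lines is what I mean. Treat it as a sketch rather than a drop-in fix: renaming the parameter to count, switching it to std::size_t, and the assert in get_random_float are my own choices, not from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Returns a pseudo-random float in [min, max] using rand().
inline float get_random_float(float min, float max)
{
    assert(min <= max);
    float f = static_cast<float>(rand()) / RAND_MAX;
    return min + f * (max - min);
}

// Allocates exactly `count` floats and fills them with random values.
// The caller passes the total element count (DIMS_SIZE * DIMS_SIZE for a
// matrix, DIMS_SIZE for a vector) and must release the result with delete[].
float* generate_random_1d_matrix(std::size_t count)
{
    float* result = new float[count];
    for (std::size_t i = 0; i < count; ++i)
        result[i] = get_random_float(10, 100);
    return result;
}
```

With this version the largest matrix in your loop (DIMS_SIZE = 2048) is 2048 * 2048 = 4,194,304 floats, or 16 MiB at 4 bytes per float, instead of 2048^4 floats.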