Search code examples
crecursionmemory-managementstaticstack-overflow

Prevent stack memory usage for recursive function in C


This C code does DBSCAN - Density-based spatial clustering of applications with noise. It's a cluster algorithm for turining unsupervised (no labels) data to become supervised (labels e.g class number) data. And it's also fast. The drawback with DBSCAN is that it consumes a lot of memory, but its speed is excellent!

So what I have done is that I have created a DBSCAN algorithm that using recursion. Once I was done with this, I notice that the recusion made a stack overflow if the row argument was large.

Questions:

  1. I want to make sure that recursion functions in C must be allocated on the heap, not the stack. How can that be done?
  2. Is there any way I could make sure this function recursion_clustering not increase any memory and only using static memory? I have tried to use so much pointers as possible to prevent memory duplication.

Code:

/* Private function */
static bool recursion_clustering(const int32_t i, const size_t* row, const float* epsilon, const size_t* min_pts, const float C[], bool visited[], size_t idx[], int32_t* id, int32_t* count);

 /*
  * Create ID's of clusters
  * X[m*n]
  * idx[m] = Contains cluster ID. Noise ID is 0
  * epsilon = Raduis of the clusters
  * min_pts = Minimum points of a valid cluster
  */
void dbscan(const float X[], size_t idx[], const float epsilon, const size_t min_pts, const size_t row, const size_t column) {
    /* Create idx */
    memset(idx, 0, row * sizeof(size_t));

    /* Create pdist2 C */
    float* C = (float*)malloc(row * row * sizeof(float));
    pdist2(X, X, C, row, column, row, PDIST2_METRIC_EUCLIDEAN);

    /* Flags */
    bool* visited = (bool*)malloc(row * sizeof(bool));
    memset(visited, false, row * sizeof(bool));
    
    /* Iterate */
    int32_t i, count, id = 1;
    for (i = 0; i < row; i++) {
        if (recursion_clustering(i, &row, &epsilon, &min_pts, C, visited, idx, &id, &count)) {
            /* Set ID value */
            idx[i] = id;
            id++;
        }
    }

    /* Free */
    free(C);
    free(visited);
}

static bool recursion_clustering(const int32_t i, const size_t* row, const float* epsilon, const size_t* min_pts, const float C[], bool visited[], size_t idx[], int32_t* id, int32_t* count) {
    /* Declare variables */
    int32_t j;

    /* Check if already visited - else mark as visited */
    if (visited[i]) {
        return false;
    }
    else {
        visited[i] = true;
    }

    /* Begin first with the row of C */
    *count = 0;
    for (j = 0; j < *row; j++) {
        /* Check how many index exceeds min_pts */
        if (C[i * (*row) + j] <= *epsilon) {
            (*count)++;
        }
    }

    /* Check if count is less than min_pts */
    if (*count < *min_pts) {
        return false;
    }

    /* Iterate forward */
    for (j = i + 1; j < *row; j++) {
        if (!visited[j] && C[i * (*row) + j] <= *epsilon) {
            /* Recursion */
            if (recursion_clustering(j, row, epsilon, min_pts, C, visited, idx, id, count)){
                /* Set ID value */
                idx[j] = *id;
            }
        }
    }

    /* Iterate backward */
    for (j = i - 1; j >= 0; j--) {
        if (!visited[j]) {
            /* Recursion */
            if (recursion_clustering(j, row, epsilon, min_pts, C, visited, idx, id, count)) {
                /* Set ID value */
                idx[j] = *id;
            }
        }
    }

    /* Return true */
    return true;
}

If you want to run this code, you need this working example.

#define row 65
#define column 3

int main() {
    /* Define X, idx, epsilon and min_pts */
    float X[row * column] = { -0.251521, 1.045117, -1.281658,
        -1.974109, 0.278170, -1.023392,
        -0.957729, -0.977450, 0.477872,
        -0.449159, -1.016680, 0.095610,
        -1.785787, -1.403543, 0.483454,
        1.366889, -0.762590, -1.162454,
        2.129839, 0.358568, -2.118250,
        0.751071, -1.766582, 0.178434,
        -1.980847, -1.320933, -0.457778,
        -0.478030, 0.606917, -1.630624,
        3.674916, 0.088957, 0.877373,
        0.637213, 0.079176, 0.342038,
        1.142329, 0.629997, 0.311134,
        -0.878974, 0.042527, 0.736522,
        1.751637, -1.434299, -1.325140,
        1.110682, 1.091970, 1.434869,
        -0.504482, -2.504821, -1.245315,
        -0.102915, -0.203266, -0.849767,
        -0.822834, 1.158801, -0.405579,
        -1.278287, 0.391306, 0.857077,

        /* Outliers */
         45, 43, 0,
         23, -3, 2,

        10.6772, 10.7365, 9.9264,
        8.7785, 11.1680, 9.5915,
        8.6872, 9.6464, 10.3801,
        10.0142, 8.8311, 9.2021,
        8.4179, 9.8572, 11.6356,
        9.8487, 10.4979, 10.8620,
        10.0957, 9.7878, 12.2653,
        11.4528, 11.5186, 10.3050,
        10.9284, 9.9654, 10.4562,
        8.5272, 10.7451, 9.8355,
        10.1508, 10.2318, 10.2417,
        10.7342, 10.0689, 9.9918,
        10.4784, 9.2032, 10.6060,
        10.1309, 9.4392, 10.9674,
        10.6971, 10.3347, 11.0447,
        7.9489, 9.4566, 9.5258,
        10.4827, 10.3030, 10.5582,
        10.4496, 10.3880, 11.1661,
        11.0291, 10.0233, 9.9280,
        9.0638, 9.3650, 9.3670,

        /* Outliers */
         32, 54, 23,
         23, 51, 77,


    -34.233,  -30.841,  -31.720,
     -32.629,  -31.786,  -31.290,
     -31.466,  -31.984,  -33.254,
     -31.878,  -33.052,  -31.761,
     -33.528,  -30.921,  -32.836,
     -31.793,  -32.082,  -30.453,
     -31.812,  -32.417,  -31.874,
     -32.127,  -32.599,  -32.806,
     -32.979,  -32.096,  -31.754,
     -31.759,  -31.925,  -31.313,
     -30.531,  -31.838,  -31.179,
     -32.168,  -31.928,  -30.649,
     -31.049,  -32.092,  -31.408,
     -33.006,  -31.753,  -31.961,
     -32.092,  -32.391,  -31.501,
     -31.184,  -31.634,  -32.802,
     -30.658,  -31.616,  -31.493,
     -31.958,  -31.694,  -31.425,
     -33.114,  -32.029,  -31.459,
     -31.081,  -34.486,  -32.020,

        /* Outlier */
        -22, -34, 53 };

    size_t idx[row] = { 0 };
    float epsilon = 10;
    size_t min_pts = 3;

    /* Compute dbscan */
    dbscan(X, idx, epsilon, min_pts, row, column);

    /* Print idx */
    size_t i;
    for (i = 0; i < row; i++) {
        printf("%i\n", idx[i]);
    }

    return 0;
}

The output becomes:

1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
0
0
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
0
0
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
0

Solution

    1. I want to make sure that recursion functions in C must be allocated on the heap, not the stack. How can that be done?

    That's impossible. The C standard itself does in fact not mention where things are allocated, except that allocated storage is a special case only utilized when malloc (and friends) is used explicitly.

    Other than that, where parameters and return addresses are stored is determined by calling convention and ABI of your specific system. The vast majority of all computers out there use a combination of registers and stacking for this.

    1. Is there any way I could make sure this function recursion_clustering not increase any memory and only using static memory? I have tried to use so much pointers as possible to prevent memory duplication.

    Not without a complete program rewrite. It is possible to design recursive functions that the compiler might inline, which then do not come with any stack peak usage. But in order for that to be possible, the recursion must typically be a so called tail call, where the recursive call is at the very bottom of the function. The function should also be declared static to ensure internal linkage. But even then there's no guarantee of the compiler being able to inline.

    As it turns out, the best practice is to avoid recursion entirely.


    So how to solve it? You can replace any recursive function with a loop, but in that case there is no automatic stacking of parameters. One way to solve that is to develop your own stack-like data type and have a custom made stack on the side and push/pop in that one from the loop. This stack type could then use malloc or whatever you like internally. On the other hand, you have a ton of parameters there - in case you actually don't need a local copy of each and every one of them, you can just store the result in a single variable shared by all iterations of the loop.

    Another way is to integrate a parent/child or prev/next system in the data itself. For example if the purpose of the recursion was to search through a BST, you could give each node a parent pointer, so that when you run into a "leaf" with a null pointer both to the left and right, you can continue execution from the parent. The disadvantage of this method is that it consumes extra space together with the data.

    Either way, a fair bit of extra work. But once you've done it and done it proper, it should by definition execute faster, or at least not slower, than what you have currently. Since recursion comes with execution speed overhead.