How to fix D "memory leaks"


So I've been searching for a solution to this problem for some time. I've written a program that takes data from two separate text files, parses it, and outputs it to another text file and to an ARFF file for analysis by Weka. The problem I'm running into is that the function I wrote to handle the reading and parsing doesn't de-allocate its memory properly. Every successive call uses an additional 100MB or so, and I need to call this function over 60 times over the course of the program. Is there a way to force D to de-allocate memory, particularly for arrays, dynamic arrays, and associative arrays?

An example of my problem:

struct Datum {
    string Foo;
    int Bar;
}

Datum[] Collate() {
    Datum[] data;
    int[] userDataSet;
    int[string] secondarySet;
    string[] raw = splitLines(readText(readFile)).dup;

    foreach (r; raw) {
        userDataSet ~= parse(r);
        secondarySet[r.split(",").dup] = parseSomeOtherWay(r);
    }

    data = doSomeOtherCalculation(userDataSet, secondarySet);

    return data;
}

Solution

  • Are the strings in the returned data still pointing inside the original text file?

    Array slicing operations in D do not copy the data; a slice just stores a pointer and a length. The same applies to the results of splitLines and split, and possibly to doSomeOtherCalculation. Note that calling .dup on the array returned by splitLines only copies the array of slices, not the text each slice points to. This means that as long as any substring of the original file text is still reachable anywhere in the program, the garbage collector cannot free the file's contents.

    If the data you're returning is only a small fraction of the size of the text file you're reading, you can copy just the substrings you keep with .idup (or .dup, if you want a mutable char[] rather than a string). This will prevent the small strings from pinning the entire file's contents in memory.
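
To make the point about slices concrete, here is a minimal sketch. It assumes the file contents are a small in-memory string standing in for readText, and it uses a hypothetical helper, pointsInside, that is not part of the original code. It shows that a field taken from a splitLines result still points into the original buffer, while a copy made with .idup (the counterpart of .dup that yields a string rather than a char[]) lives in its own allocation.

```d
import std.stdio : writeln;
import std.string : splitLines;

struct Datum {
    string Foo;
    int Bar;
}

// Hypothetical helper (not in the original post): true if `part`
// lies entirely within the memory occupied by `whole`.
bool pointsInside(const(char)[] part, const(char)[] whole) {
    return part.ptr >= whole.ptr
        && part.ptr + part.length <= whole.ptr + whole.length;
}

void main() {
    // Stand-in for readText(...): one GC-allocated buffer holding the "file".
    string text = "alpha,1\nbeta,2\ngamma,3\n".idup;

    // splitLines returns slices of `text`; no character data is copied.
    string[] lines = text.splitLines();
    string name = lines[1][0 .. 4];          // the "beta" field

    // Keeping the slice keeps the whole buffer reachable.
    Datum pinned = Datum(name, 2);
    assert(pointsInside(pinned.Foo, text));

    // .idup allocates a fresh 4-byte string that no longer refers to `text`.
    Datum copied = Datum(name.idup, 2);
    assert(copied.Foo == "beta");
    assert(!pointsInside(copied.Foo, text));

    writeln("slice inside buffer: ", pointsInside(pinned.Foo, text));
    writeln("copy inside buffer:  ", pointsInside(copied.Foo, text));
}
```

In a function like Collate, the same idea would mean applying .idup to each string at the point where it is stored in a Datum (or used as an associative array key), so that nothing returned from the function refers to the file buffer once the function exits. Whether this accounts for the full 100MB per call depends on what the unshown helper functions retain.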