memory-management, garbage-collection, d

Is this the right way to combine garbage-collected and non-garbage-collected code in D?


I have been taking an interest in systems-level programming languages, and D caught my attention. I was told that in D the GC is on by default but can be turned off. I wanted to go beyond that and combine manual memory management with automatic memory management. Here is my code:

import core.stdc.stdio;
import std.stdio;
import std.array;
import core.stdc.stdlib;

void main() @trusted
{
  string word = "hello";
  int sizeOfString = word.length;
  string* words = cast(string*)malloc(char.sizeof*sizeOfString);

  string[] letters = "hello".split("");
  for(auto i = 0; i < letters.length;i++) {
    words[i] = letters[i];
  }
  writeln(*(words[4]).ptr);
  free(words);
  
}




Solution

  • The code doesn't do what you think it does. Let's walk through what's happening. (The imports are fine, so I'm omitting them for brevity.)

    void main() @trusted
    {
      string word = "hello";
    
      /* This should be size_t, not int. word.length is a size_t, and on
         64-bit targets the implicit narrowing to int is a compile error. */
      int sizeOfString = word.length;
    
      /* First, you don't need char.sizeof. It's always 1.
         More importantly, string isn't the same as char*. It's an array, which
         contains both a pointer and a length, so each string takes string.sizeof
         bytes (two machine words), not one. malloc hands you word.length bytes
         here, which is enough for that many chars but not for even one string,
         so the writes in the loop below run past the end of the buffer. If the
         program appears to work, that is a coincidence. What you have allocated
         is usable as a pointer to char, not as a pointer to string.
      */
      string* words = cast(string*)malloc(char.sizeof*sizeOfString);
    
      /* Redundant. We can just loop through the characters (once the type of words is fixed) */
      string[] letters = "hello".split("");
      for(auto i = 0; i < letters.length;i++) {
        words[i] = letters[i];
      }
    
      /* More roundabout stuff that won't be necessary once we fix the earlier bits */
      writeln(*(words[4]).ptr);
    
      free(words);
      
    }
    

    This code does what you seem to have been trying to do:

    void main() {
        string word = "hello";
    
        /* word2 points at word.length bytes of memory. We need to track
           that length elsewhere (we can use word.length for that below.) */
        char* word2 = cast(char*)malloc(word.length);
    
        /* No need to split word into an array of length-1 strings anymore.
           We just copy the characters over. */
        foreach (i; 0..word.length) {
            word2[i] = word[i];
        }
    
        /* And this line is now just like it would be in C, as well */
        writeln(word2[4]);
    
        free(word2);
    }
    
    

    I was also going to show code that allocates an array of length-1 strings, but that doesn't really work. Strings are dynamic arrays (dynamic arrays of immutable char, but dynamic arrays nevertheless), and there's a lot of magic happening in the implementation that isn't defined by the specification. In particular, writing string foo; foo.length = 42 doesn't just set the field; it allocates memory, and in doing so it assumes that foo.ptr is pointing to GC-managed memory. We could probably get it pointing at malloc'ed memory and get the length right by using memset (possibly after checking system endianness), but it'd be a crock and probably undefined behavior, so I didn't include that in the answer.
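
To make the size mismatch concrete, here is a small sketch that only inspects the layout and allocates nothing. The helper name bytesForStrings is hypothetical and exists purely for illustration. It shows that a string is a two-word (length, pointer) pair, so the number of bytes the question passed to malloc is far smaller than what that many strings would need.

```d
import std.stdio : writeln;

/// Hypothetical helper: number of bytes needed to hold `n` string slices.
size_t bytesForStrings(size_t n)
{
    return n * string.sizeof;
}

void main()
{
    string word = "hello";

    // A char is one byte; a string is a (length, pointer) pair.
    static assert(char.sizeof == 1);
    static assert(string.sizeof == 2 * size_t.sizeof);

    // Slicing out one character gives another (length, pointer) pair
    // that points into the same data; nothing is copied.
    string last = word[4 .. 5];
    assert(last.length == 1);
    assert(last.ptr == word.ptr + 4);

    writeln("bytes malloc'ed in the question: ", char.sizeof * word.length);
    writeln("bytes needed for that many strings: ", bytesForStrings(word.length));
}
```

The second number is string.sizeof times larger than the first, which is why the original loop wrote past the end of its buffer.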

    An option for convenient handling of malloced arrays is to use slices:

    void main() {
        string word = "hello";
        char* word2 = cast(char*)malloc(word.length);
    
        foreach (i; 0..word.length) {
            word2[i] = word[i];
        }
    
        /* The slice points at the same memory as word2, but
           is an actual array type.
        */
        char[] s = word2[0..word.length];
        writeln(word2[4]); // 'o'
        s[4] = 'x';
        writeln(word2[4]); // 'x'
    
        free(word2);
    }
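
The same idea can be wrapped in a small function, which may be easier to reuse. This is only a sketch: mallocCopy is a hypothetical helper name, not something from Phobos or druntime. It adds a null check on malloc and uses a slice copy instead of an explicit loop. The caller still owns the memory and must free it.

```d
import core.stdc.stdlib : malloc, free;
import std.stdio : writeln;

/// Hypothetical helper: copies `src` into a fresh malloc'ed buffer and
/// returns it as a slice. The caller must pass `result.ptr` to free().
/// Returns null if `src` is empty or the allocation fails.
char[] mallocCopy(const(char)[] src)
{
    if (src.length == 0)
        return null;

    auto p = cast(char*) malloc(src.length);
    if (p is null)
        return null;

    char[] dst = p[0 .. src.length]; // the slice carries the length with it
    dst[] = src[];                   // element-wise copy; lengths must match
    return dst;
}

void main()
{
    char[] s = mallocCopy("hello");
    scope (exit) free(s.ptr);        // free(null) is a no-op

    if (s is null)
        return;

    s[4] = 'x';
    writeln(s);                      // prints "hellx"
}
```

Using scope (exit) keeps the free next to the allocation, so the buffer is released on every path out of main.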
    

    You should probably not attempt to resize that slice. The article at https://dlang.org/articles/d-array-article.html#append-on goes into some of the gritty details of what happens when you resize a slice. Resizing normally either extends the block in place or reallocates and copies, but that logic seems to assume that the slice was already pointing at GC-managed memory (so that the runtime has its "used" bookkeeping for the block). I'm not sure whether it's well-defined to resize a slice that points at memory you malloc'ed yourself.
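
One way to probe this on your own compiler is the sketch below. It rests on an assumption worth verifying against the druntime documentation: that capacity reports 0 for a slice the runtime cannot extend in place, and that appending to such a slice copies the data into a new GC-allocated array. The function name appendLeavesMallocBlock is hypothetical. If the assumption holds, the slice silently stops referring to the malloc'ed block after the append, which is another reason not to resize it.

```d
import core.stdc.stdlib : malloc, free;
import std.stdio : writeln;

/// Hypothetical check: returns true if appending to a slice of malloc'ed
/// memory moved the slice elsewhere and left the malloc'ed bytes untouched.
bool appendLeavesMallocBlock()
{
    enum n = 5;
    char* raw = cast(char*) malloc(n);
    if (raw is null)
        return false;
    scope (exit) free(raw);          // raw is still ours to free

    char[] s = raw[0 .. n];
    s[] = "hello";

    immutable capBefore = s.capacity; // assumed to be 0 for non-GC memory
    s ~= '!';                         // assumed to reallocate on the GC heap

    return capBefore == 0
        && s.ptr !is raw
        && s == "hello!"
        && raw[0 .. n] == "hello";
}

void main()
{
    // Expected to print true if the assumption above holds.
    writeln(appendLeavesMallocBlock());
}
```

Note that after the append, writes through s no longer affect the malloc'ed buffer, and freeing s.ptr instead of the original pointer would be a bug.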