How can I use a HashMap of List of String in Vala?


I am trying to use a HashMap of Lists of strings in Vala, but it seems the object lifecycle is biting me. Here is my current code:

public class MyClass : CodeVisitor {
    GLib.HashTable<string, GLib.List<string>> generic_classes = new GLib.HashTable<string, GLib.List<string>> (str_hash, str_equal);

    public override void visit_data_type(DataType d) {
        string c = ...
        string s = ...

        if (! this.generic_classes.contains(c)) {
            this.generic_classes.insert(c, new GLib.List<string>());
        }

        unowned GLib.List<string> l = this.generic_classes.lookup(c);

        bool is_dup = false;
        foreach(unowned string ss in l) {
            if (s == ss) {
                is_dup = true;
            }
        }
        if ( ! is_dup) {
            l.append(s);
        }
    }
}

Note that I am adding a string value into the list associated with a string key. If the list doesn't exist, I create it.

Let's say I run the code with the same values of c and s three times. Based on some printf debugging, it seems that only one list is created, yet each time it is empty. I'd expect the list to have size 0, then 1, and then 1. Am I doing something wrong when it comes to Vala memory management/reference counting?


Solution

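  • GLib.List is the problem here. Most operations on GLib.List and GLib.SList return a modified pointer, but the value stored in the hash table isn't updated to match. An empty GLib.List is just a null pointer, so l.append(s) points your local variable l at a newly allocated node while the hash table keeps holding the original empty list. It's a bit hard to explain why GLib works that way without diving down into the C. You have a few choices here.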

    1. Use one of the containers in libgee which support multiple values with the same key, like Gee.MultiMap. If you're working on something in the Vala compiler (which I'm guessing you are, as you're subclassing CodeVisitor), this isn't an option because the internal copy of gee Vala ships with doesn't include MultiMap.
    2. Replace the GLib.List instances in the hash table. Unfortunately this is likely going to mean copying the whole list every time, and even then getting the memory management right would be a bit tricky, so I would avoid it if I were you.
    3. Use something other than GLib.List. This is the way I would go if I were you.
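
    To make the modified-pointer behaviour concrete, here is a minimal sketch of what the code in the question effectively does. The function name stored_length_after_append is made up for this illustration, and the snippet deliberately leaks the appended node, so treat it as a demonstration rather than something to copy.

```vala
// Hypothetical demo: returns the length of the list stored in the table
// after appending through a local unowned reference.
uint stored_length_after_append () {
    var table = new GLib.HashTable<string, GLib.List<string>> (GLib.str_hash, GLib.str_equal);

    // An empty GLib.List is a null pointer, so the table stores null here.
    table.insert ("foo", new GLib.List<string> ());

    unowned GLib.List<string> l = table.lookup ("foo");

    // Roughly `l = g_list_append (l, ...)` in C: only the local `l` is updated.
    // The new node is never stored in the table (and is leaked in this demo).
    l.append ("bar");

    // The table still holds the original empty list.
    return table.lookup ("foo").length ();
}

void main () {
    stdout.printf ("stored length: %u\n", stored_length_after_append ());
}
```

    The stored list should still report a length of 0, which matches the "empty every time" behaviour described in the question.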

    Edit: I recently added GLib.GenericSet to Vala as an alternative binding for GHashTable, so the best solution now would be to use GLib.HashTable<string, GLib.GenericSet<string>>, assuming you're okay with depending on vala >= 0.26.
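
    As a rough sketch of the GenericSet approach, assuming vala >= 0.26 and a reasonably recent GLib: add_unique is a hypothetical helper written for this example, not something GLib provides.

```vala
// Hypothetical helper: adds `value` to the set stored under `key`,
// creating the set on first use. Returns true if the value was new.
bool add_unique (GLib.HashTable<string, GLib.GenericSet<string>> table,
                 string key, string value) {
    unowned GLib.GenericSet<string>? members = table.lookup (key);

    if (members == null) {
        var fresh = new GLib.GenericSet<string> (GLib.str_hash, GLib.str_equal);
        members = fresh;                      // keep an unowned reference for use below
        table.insert (key, (owned) fresh);    // the table now owns the set
    }

    if (members.contains (value)) {
        return false;                         // duplicate, nothing to do
    }

    members.add (value);                      // the set stores its own copy of the string
    return true;
}

void main () {
    var generic_classes =
        new GLib.HashTable<string, GLib.GenericSet<string>> (GLib.str_hash, GLib.str_equal);

    // Same key and value three times, as in the question.
    for (int i = 0; i < 3; i++) {
        bool added = add_unique (generic_classes, "foo", "bar");
        stdout.printf ("attempt %d: %s\n", i, added ? "added" : "duplicate");
    }
}
```

    Because a GenericSet is modified in place rather than through a returned pointer, the unowned reference obtained from lookup keeps referring to the same object the table owns. The set also handles duplicate detection, so the manual foreach loop from the question is no longer needed.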

    If I were you, I would use GLib.HashTable<string, GLib.HashTable<unowned string, string>>:

    private static int main (string[] args) {
      // The outer table owns its keys and the inner tables; each inner table is used as a set.
      GLib.HashTable<string, GLib.HashTable<unowned string, string>> generic_classes =
        new GLib.HashTable<string, GLib.HashTable<unowned string, string>> (GLib.str_hash, GLib.str_equal);

      for (int i = 0 ; i < 3 ; i++) {
        string c = "foo";
        string s = i.to_string ();
        unowned GLib.HashTable<unowned string, string>? inner_set = generic_classes[c];

        stdout.printf ("Inserting <%s, %s>, ", c, s);

        // First time this key is seen: create the inner set and hand ownership to the outer table.
        if (inner_set == null) {
          var v = new GLib.HashTable<unowned string, string> (GLib.str_hash, GLib.str_equal);
          inner_set = v;
          generic_classes.insert ((owned) c, (owned) v);
        }

        // The key is an unowned reference to the same string that the value owns.
        inner_set.insert (s, (owned) s);

        // Dump the whole container after each insertion.
        stdout.printf ("container now holds:\n");
        generic_classes.foreach ((k, v) => {
          stdout.printf ("\t%s:\n", k);
          v.foreach ((ik, iv) => {
            stdout.printf ("\t\t%s\n", iv);
          });
        });
      }

      return 0;
    }
    

    It may seem hackish to have a hash table whose keys and values are the same strings, but this is actually a common pattern in C as well, and it is specifically supported by GLib's hash table implementation.

    Moral of the story: don't use GLib.List or GLib.SList unless you really know what you're doing, and even then it's generally best to avoid them. TBH we probably would have marked them as deprecated in Vala long ago if it weren't for the fact that they're very common when working with C APIs.