Search code examples
c++multithreadingconcurrencyc++17tbb

tbb:concurrent_hash_map<K,V>: sample code for Intel Threading Building Blocks (TBB)


Looking for sample code to use tbb::concurrent_hash_map<K,V> from Intel Threading Building Blocks (TBB).

I can insert, but I cannot seem to read the values back.

The official Intel documentation appears to be somewhat lacking on the sample code side.

Update

The best docs are in "Pro TBB: C++ Parallel Programming with Threading Building Blocks" by Voss. Download this book for free (it's public domain).

Ignore the Intel docs. They are essentially a collection of function signatures.


Solution

  • Intel TBB is open source, and on GitHub:

    https://github.com/intel/tbb
    

    To install TBB, I used vcpkg which is compatible with Linux, Windows and Mac. Yes, vcpkg is from Microsoft, but it is 100% cross-platform, open source, and very popular.

    Linux:

    ./vcpkg search tbb              # Find the package.
    ./vcpkg install tbb:x64-linux   # Install the package.
    

    Windows:

    vcpkg search tbb                # Find the package.
    vcpkg install tbb:x64-windows   # Install the package.
    

    Compile:

    • Compatible with any modern compiler including MSVC, GCC, LLVM, Intel Compiler (ICC), etc. I used CMake for gcc.

    Can also download the source and extract the headers and libraries into the source tree, this works just as well.

    Code.

    #include "tbb/concurrent_hash_map.h" // For concurrent hash map.
    
    tbb::concurrent_hash_map<int, string> dict;
    typedef tbb::concurrent_hash_map<int, string>::accessor dictAccessor; // See notes on accessor below.   
    
    print("  - Insert key, method 1:\n");   
    dict.insert({1,"k1"});
    print("    - 1: k1\n");
    
    print("  - Insert key, method 2:\n");
    dict.emplace(2,"k2");
    print("    - 2: k2\n");
    
    string result;
    
    {
        print("  - Read an existing key:\n");   
        dictAccessor accessor;
        const auto isFound = dict.find(accessor, 2);
        // The accessor functions as:
        // (a) a fine-grained per-key lock (released when it goes out of scope).
        // (b) a method to read the value.
        // (c) a method to insert or update the value.
        if (isFound == true) {
            print("    - {}: {}\n", accessor->first, accessor->second);
        }
    }
    
    {
        print("  - Atomically insert or update a key:\n");  
        dictAccessor accessor;
        const auto itemIsNew = dict.insert(accessor, 4);
        // The accessor functions as:
        // (a) a fine-grained per-key lock (released when it goes out of scope).
        // (b) a method to read the value.
        // (c) a method to insert or update the value.
        if (itemIsNew == true) {
            print("    - Insert.\n");
            accessor->second = "k4";
        }
        else {
            print("    - Update.\n");
            accessor->second = accessor->second + "+update";
        }
        print("    - {}: {}\n", accessor->first, accessor->second);     
    }
    
    {
        print("  - Atomically insert or update a key:\n");          
        dictAccessor accessor;
        const auto itemIsNew = dict.insert(accessor, 4);
        // The accessor functions as:
        // (a) a fine-grained per-key lock which is released when it goes out of scope.
        // (b) a method to read the value.
        // (c) a method to insert or update the value.
        if (itemIsNew == true) {
            print("    - Insert.\n");
            accessor->second = "k4";
        }
        else {
            print("    - Update.\n");
            accessor->second = accessor->second + "+update";
        }
        print("    - {}: {}\n", accessor->first, accessor->second);     
    }
    
    {
        print("  - Read the final state of the key:\n");            
        dictAccessor accessor;
        const auto isFound = dict.find(accessor, 4);
        print("    - {}: {}\n", accessor->first, accessor->second);
    }
    

    Printing uses {fmtlib} for printing; can replace with cout <<.

    Output:

    - Insert key, method 1:
      - 1: k1
    - Insert key, method 2:
      - 2: k2
    - Read an existing key:
      - 2: k2
    - Atomically insert or update a key:
      - Insert.
      - 4: k4
    - Atomically insert or update a key:
      - Update.
      - 4: k4+update
    - Read the final state of the key:
      - 4: k4+update
    

    Other hash maps

    • See: https://tessil.github.io/2016/08/29/benchmark-hopscotch-map.html
    • See: std::unordered_map. This has a more standard API, and is thread safe in many situations, see: unordered_map thread safety. Suggest using this, if possible, as it has a simpler API.
    • There is also the concurrent_unordered_map from Intel TBB. It is essentially the same thing, a key/value map. However, it is much older, much much lower level, and more difficult to use. One has to supply a hasher, a equality operator, and an allocator. There is no sample code anywhere, even in the official Intel docs. I never got it working, despite months of occasional attempts. It may be obsolete, as it is not mentioned in said free book (it only covers concurrent_hash_map). Not recommended.

    Update: Reader/Writer Locks

    There are actually two accessors, one is a read lock, one is a write lock:

    • const_accessor
    • accessor

    If using find, use const_accessor which is a read lock. If using insert or erase, use accessor which is a write lock (i.e. it will wait until any reads are done, and block further reads until it is done).

    This is effectively equivalent to a reader/writer lock, but on a single dictionary key in the dictonary, rather than the entire dictionary.

    Update

    Final part of the learning curve: for key writes, nothing happens until the accessor goes out of scope. So any locks are held for no more than a few machine instructions, probably using CAS (Compare And Swap).

    Comparing this to a database, the scope of the accessor is like a transaction. When the accessor goes out of scope, the entire transaction is committed to the hashmap.

    Update

    The free book mentioned above has fantastic performance tips in the chapter on concurrent_hash_map.

    Conclusion

    The API for this hash map is powerful but somewhat awkward. However, it supports fine-grained, per-key locks on insert/update. Any locks are only held for a handful of machine instructions, using CAS. This is something that few other hashmaps can offer, in any language. Recommend starting with std::unordered_map for simplicity; it is thread safe as long as the two threads do not write to the same key. If blazingly fast performance is required, there is an option to either refactor, or write a compatible wrapper on top with [] accessors and insert_or_update().