Tags: cmake, c++20, simd, tbb, g++12

What brings about a dependency on TBB?


I am using g++ 12 and CMake. I have a source file

holes5.cpp

which does not

#include <execution>

and does not need to link to tbb. Now if I add

#include <execution>

it does not require linking to TBB either. So what exact step makes the program start depending on TBB (and thus require linking to it)? I am confused.

I installed TBB via

sudo apt install libtbb-dev

In CMakeLists.txt:

list(APPEND CMAKE_MODULE_PATH "deps/tbb/cmake/")
find_package(TBB REQUIRED)

set (SOURCES holes5.cpp)
add_executable(holes5 ${SOURCES})

set (SOURCES par_unseq.cpp)
add_executable(par_unseq ${SOURCES})
target_link_libraries(par_unseq PUBLIC TBB::tbb)

par_unseq.cpp:

#include <cstdint>
#include <iostream>
#include <chrono>
#include <cmath>
#include <numeric>
#include <utility>
#include <algorithm>
#include <execution>
#include <thread>   // thread::hardware_concurrency
#include <vector>   // vector
using namespace std;

double f(double x) noexcept
{
        const int N = 1000;
        for (int i = 0; i < N; ++i) {
            x = log2(x);
            x = cos(x);
            x = x * x + 1;
        }
        return x;
}

double sum(const vector<double>& vec)
{
        double sum = 0;
        for (auto x : vec)
            sum += x;
        return sum;
}

int main()
{
    cout << "Hey! Your machine has " << thread::hardware_concurrency() << " cores!\n";
    // Make an input vector.
    const int N = 1000000;
    vector<double> vecInput(N);
    for (int i = 0; i < N; ++i)
        vecInput[i] = i + 1;

    {   // Case #1: Plain transform, no parallelism.
        auto startTime = chrono::system_clock::now();
        vector<double> vecOutput(N);
        transform(vecInput.cbegin(), vecInput.cend(), vecOutput.begin(), f);
        auto endTime = chrono::system_clock::now();
        chrono::duration<double> diff = endTime - startTime;
        cout << "1. sum = " << sum(vecOutput) << ", time = " << diff.count() << "\n";
    }
    {   // Case #2: Transform with parallel unsequenced.
        vector<double> vecOutput(N);
        auto startTime = chrono::system_clock::now();
        transform(execution::par_unseq,
                    vecInput.cbegin(), vecInput.cend(), vecOutput.begin(), f);
        auto endTime = chrono::system_clock::now();
        chrono::duration<double> diff = endTime - startTime;
        cout << "2. sum = " << sum(vecOutput) << ", time = " << diff.count() << "\n";
    }

}

/* Output:
    Hey! Your machine has 4 cores!
    1. sum = 1.60346e+06, time = 43.7997
    2. sum = 1.60346e+06, time = 10.8235
*/

Solution

  • According to the libstdc++ documentation (see Note 3), you must link -ltbb whenever you include the <execution> header.

    If you don't actually use any of the functions from <execution>, or you use them but the compiler manages to inline them all, your program might link even without -ltbb. Whether it does may depend on the version of the header or library, the compiler version, or the compilation options. Even then, a successful link does not guarantee that the program works correctly: including the header could change the compilation in some way, and -ltbb could provide initialization code that is needed for things to work. Even if neither is the case now, it might be in the future.

    So I think the answer is very simple: if you include the header, link the library. If you don't need the <execution> features, then don't include the header in the first place.
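    One way to see whether a given build is likely to pick up the TBB dependency is to check, at compile time, whether the TBB headers are visible at all. The sketch below rests on an assumption that is not stated in the answer above: that libstdc++ in recent GCC releases selects its TBB backend for the parallel algorithms when <tbb/tbb.h> can be found, which would be the case after installing libtbb-dev. The names tbb_headers_visible and link_hint are hypothetical helpers, and the block deliberately does not include <execution>, so it should build with or without TBB installed.

```cpp
// Hypothetical helper: true when the TBB umbrella header can be found by
// this translation unit. Assumption: libstdc++ performs a similar check to
// decide whether <execution> is backed by TBB.
constexpr bool tbb_headers_visible() noexcept
{
#if defined(__has_include)
#  if __has_include(<tbb/tbb.h>)
    return true;
#  else
    return false;
#  endif
#else
    return false;  // cannot tell on compilers without __has_include
#endif
}

// Hypothetical helper: a short, human-readable hint about linking.
// It is only a hint; it does not inspect the standard library itself.
constexpr const char* link_hint() noexcept
{
    return tbb_headers_visible()
        ? "TBB headers found: link TBB for targets that include <execution>"
        : "TBB headers not found: <execution> likely uses a non-TBB backend";
}
```

    Printing link_hint() from main while debugging a build shows which case applies on a given machine. It does not replace the rule above: if you include the header, link the library.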