Is parallel_reduce not a thread-safe function?

I want to call tbb::parallel_reduce to sum the elements of a vector. However, when the vector has enough elements, the result is not correct. How should I use this function?

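    // needs <vector>, <tbb/parallel_reduce.h> and <tbb/blocked_range.h>
    // prepare data: a = {1.0, 2.0, ..., allNum}
    const size_t allNum = 1000000;
    std::vector<double> a;
    a.reserve(allNum);
    for (size_t i = 0; i < allNum; ++i)
    {
        a.push_back(double(i + 1));
    }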

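    // lambda that sums the vector with parallel_reduce
    auto f = [&]() -> double {
        return tbb::parallel_reduce(tbb::blocked_range<size_t>(0, allNum),
            0.0, // identity value for addition
            // body: accumulate one sub-range, starting from the value passed in
            [&](const tbb::blocked_range<size_t>& r, double init) -> double {
                for (size_t i = r.begin(); i != r.end(); ++i)
                {
                    init += a[i];
                }
                return init;
            },
            // join: combine two partial sums
            [](double x, double y) -> double {
                return x + y;
            }
            /*std::plus<double>()*/);
    };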

    // call the lambda and compare with the closed-form sum
    double correctResult = (1.0 + 1000000.0) * 500000.0;
    double sum = f(); // observed: sum != correctResult
    // sum is different on every call

Solution

I ran the code above. It worked fine and produced the correct result.
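For this particular data the result should not depend on how the range is split: every element and every partial sum is a whole number far below 2^53, so each double addition is exact. As a rough illustration, the sketch below mimics the split / accumulate / join pattern serially, using only the standard library. The names chunked_sum and make_data are hypothetical helpers, and this is not how TBB actually schedules or splits work; it only shows the contract the two lambdas have to satisfy (each sub-range starts from the identity 0.0, and partial sums are combined with +).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: sums v in chunks of `grain` elements, the way a
// reduction combines per-range partial results. Runs serially.
double chunked_sum(const std::vector<double>& v, std::size_t grain)
{
    assert(grain > 0);
    double total = 0.0; // identity value for addition
    for (std::size_t begin = 0; begin < v.size(); begin += grain)
    {
        const std::size_t end = std::min(begin + grain, v.size());
        double partial = 0.0; // each sub-range starts from the identity
        for (std::size_t i = begin; i != end; ++i)
        {
            partial += v[i]; // "body" step
        }
        total = total + partial; // "join" step
    }
    return total;
}

// Hypothetical helper: builds {1.0, 2.0, ..., n} as in the question.
std::vector<double> make_data(std::size_t n)
{
    std::vector<double> v;
    v.reserve(n);
    for (std::size_t i = 0; i < n; ++i)
    {
        v.push_back(static_cast<double>(i + 1));
    }
    return v;
}
```

Calling chunked_sum(make_data(1000000), g) with different chunk sizes g should give the same value as (1.0 + 1000000.0) * 500000.0 each time. If the TBB version gives varying results, the cause is more likely in the surrounding code or build setup than in the summation itself.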

For more information about parallel_reduce, refer to the links below:

https://software.intel.com/content/www/us/en/develop/documentation/tbb-documentation/top/intel-threading-building-blocks-developer-reference/algorithms/parallelreduce-template-function.html

https://link.springer.com/content/pdf/10.1007%2F978-1-4842-4398-5.pdf

Thanks, Santosh