c++algorithm data-structures memory-management allocation

Why std::array<int, 10> is faster than std::vector<int>(10) in this scenario?

The task (from a Bulgarian judge, click on "Език" to change it to English):

I am given N, where N belongs to [3; 1000000] different numbers X, where X belongs to [0; 18446744073709551616).

How many numbers can be formed by rearranging the digits of a different number in the input ?

Example:

25 21 10242 42210 52 24021

Answer:

Explanation:

There are 5 different numbers namely {25, 52} and {10242, 42210, 24021}

25 -> 52 and 52 -> 25

10242 -> 42210 and 42210 -> 10242

10242 -> 24021 and 24021 -> 10242

42210 -> 24021 and 24021 -> 42210

lhs -> rhs means that rhs can be formed from the digits of lhs

I solved the task successfully. Here is my solution:

#include <iostream>
#include <string>
#include <vector>
#include <array>
#include <algorithm>
#include <iterator>

namespace Const {
    std::string::size_type constexpr sz_max{ 20 };
}

int main() {
    // 0
    std::ios_base::sync_with_stdio(false);
    std::cin.tie(nullptr);
    // 1
    int n;
    std::cin >> n;
    int i;
    std::string x;
    x.reserve(Const::sz_max);
    std::vector<std::array<int, 10>> lut(n);
    for (i = 0; i != n; ++i) {
        std::cin >> x;
        for (auto const& ch : x) {
            ++lut[i][ch - '0'];
        }
    }
    // 2
    std::sort(lut.begin(), lut.end());
    // 3
    auto prev{ lut.cbegin() };
    decltype(prev) curr;
    auto cnt{ 1 }, ans{ 0 };
    for (curr = std::next(prev); curr != lut.cend(); ++curr) {
        if (*curr == *prev) {
            ++cnt;
        }
        else {
            if (cnt != 1) {
                ans += cnt;
            }
            prev = curr;
            cnt = 1;
        }
    }
    if (cnt != 1) {
        ans += cnt;
    }
    prev = curr;
    cnt = 1;
    // 4
    std::cout << ans << '\n';
    return 0;
}

The execution time of this version is 550-600 ms.

If I change the definition of lut in // 1 from

std::vector<std::array<int, 10>> lut(n);

std::vector<std::vector<int>> lut(n, std::vector<int>(10));

the execution time increases drastically to 1050-1150 ms.

In both cases heap allocations have to take place but why the first ones are faster than the second ones ?

Solution

std::array doesn't make a separate heap allocation for its elements like std::vector does, i.e.,

sizeof(std::array<int,16>) == sizeof(int) * 16

All of the elements are stored directly in the object. Because of this, all the data in a vector<array<int,10>> is stored in the same contiguous allocation.