Search code examples
c++stltime-complexityspace-complexity

Complexity of std::inplace_merge


So I have two sorted vectors and I wanted to merge them into a single sorted vector without using an additional vector.

Since there's this condition I can't use std::merge, so I tried std::inplace_merge.

I have this function which takes both vectors and two numbers m and n which are the number of elements in each vector. Here's how I solved this problem:

void merge(vector<int>& nums1, int m, vector<int>& nums2, int n) 
    {
         auto it = std::remove_if(std::execution::par, nums1.begin() + m, nums1.end(), 
             [](const int val)
             {
                 return val == 0;
             }
         );
         nums1.erase(it, nums1.end());

        nums1.insert(nums1.end(), nums2.begin(), nums2.end());
        std::inplace_merge(std::execution::par, nums1.begin(), nums1.begin() + nums1.size() - nums2.size(), nums1.end());
    }

Example case

vec 1 = [1,2,3,0,0,0]
m = 3
vec2 = [2,5,6]
n = 3

Expected Output: [1, 2, 2, 3, 5, 6]

My attempt at finding space and time complexity

this is all working fine, now what I want is to find out the time and space complexity. I came up with the following:

The space complexity would be the size of the total vector right? In this case I believe it's O(m + n)

As for the time complexity, std::remove_if would take at most m , so will std::vector::erase and std::vector::insert would take n (the number of elements in the second vector) and finally std::inplace_merge would take O(n log n).

So in the end we have O(n log n + 2m + n), am I correct ?


Solution

  • Your calculations are mostly correct, but:

    1. You're mixing your parameters; the n you use changes meaning between number of elements in one vector and number of elements in the combined vector.
    2. Traditionally, big-O notation ignores constant factors, and any terms that are dwarfed by a higher big-O expression. So 2m is the same thing as m, and if n is a superset of m, then n log n means you ignore the 2m term entirely.
    3. Similarly, you combine terms that are effectively the same thing (so having m for nums1 and n for nums2 is pretty meaningless; the n log n work is on the combined size; you'd usually ignore the distinction between the two inputs).
    4. Your space complexity (if you're not dropping constant factors) is arguably O(m + 2n), since you're not consuming the elements from the second vector, you're copying them (so you end up with one copy of each element in nums1, and two of each element in nums2). But of course, per #2/#3, that's more detail than you'd use for big-O notation.

    So in big-O terms, you'd express the time complexity as simply O(n log n) (n being the combined size of both inputs), and the space complexity as O(n) (it needs up to 2n space, if nums1 is empty and nums2 contains all the data, but constant factors don't matter to big-O). As the input sizes grow, those are the dominating terms, everything else becomes irrelevant from a theoretical perspective (even if those practical concerns might actually matter in real life).

    Note that, in practice, std::inplace_merge will attempt to allocate temporary space if possible, and if it succeeds, the time complexity (in terms of number of comparisons needed) drops to O(n), so the real world space used may be higher than you expect, while the time may grow more slowly than expected.