Search code examples
time-complexitycomplexity-theoryasymptotic-complexityspace-complexity

What is the time/space complexity of this function that calculates the cartesian product of vectors?


I am having trouble figuring out what the time and space complexity of this function is.

vector<vector<string>> stringCombinations(const vector<vector<string>> &values) {
    vector<vector<string>> results = {{}};

    for(const auto &vec: values) {
        vector<vector<string>> temp;

        for(const auto &r: results) {
            for(const auto &s: vec) {
                temp.push_back(r);
                temp.back().push_back(s);
            }
        }

        results = move(temp);
    }

    for(const auto &row: results) {
        for(const auto &s: row) {
            cout << s << " ";
        }
        cout << endl;
    }

    return results;
}

This function is just the Cartesian Product of strings. Given a vector of vectors of strings, it prints and returns all combinations of those strings.

When seeing 3 nested loops I immediately think it's roughly on the scale of O(n3) in terms of time complexity. However, that second loop changes in length as the function continues running. At first iteration, there is only one element within the vector which is the empty vector. At second iteration, it contains n vectors of length 1 where n is the length of the first vector of strings. So I'm really unsure of how that affects things.

As for space complexity, I am thinking it's somewhere around O(m*n) due to the 2D vector used to store the results. I'm just unsure of what exactly m and n are here.


Solution

  • The complexity of the stringCombinations() function that computes the cartesian product of n vectors of length k is O(n * k^n).

    Let v be the argument of the function stringCombinations() and let n denote v.size(). Let v1 ... vn denote the elements of v and let |u| denote the number of elements in the vector u.

    The function computes the cartesian product of the n vectors v1 ... vn. The size of the resulting vector results is |v1| * |v2| * .... * |vn|. Each element of results is a vector of size n. Let k denote max( |v1|,..., |vn|).

    Now we can estimate the complexity of the algorithm. Let f(i) denote the number of iterations of the second loop on the ith iteration of the 1st loop. Then we are going to get O(n * f(n) * k) as k is the upper bound on the number of iterations of the third loop.

    For f(n) we have the following formula. If ki is the number of strings in vi then f(i) = f(i-1) * k(i-1). As f(1) = 1, it is easy to see that f(n) = 1 * k1 * ... * k(n-1) = O(k^(n-1)). Then the complexity of the main loop is bounded by O(n * k^n).

    Update on 13.03.2018 Let us go into more details here (and clarify the notation and fix some mistakes in the process). On every iteration of the 1st loop the 2nd loop goes over the results vector. On the first iteration it contains 1 element {}. On the second iteration the size of results is equal to the size of the first element values that we denoted by k1. On the second iteration with vec = {s1,s2,...} each element r={s} will be replaced with k2 elements. {s,s1},{s,s2},..., therefore the size of the result vector after the second iteration will be equal to k1 * k2. After each subsequent iteration i the size of result would be multiplied by ki. This size will determine the number of iterations of the 2nd loop on the next iteration of the 1st loop. So *f(i+1) = 1 * k1 * ... * ki.

    An example would be handy. Let values = {{"1","2"},{"3","4","5"}}. Here n = 2. At the beginning of the first iteration we have

       results = {{}}, vec={"1","2"}, *k1* = 2, *f(1)*=1
    

    At the beginning of the second iteration we have

       results = {{"1"},{"2"}}, vec={"3","4","5"}, *k2*=3, *f(2)*=2
    

    At the end we have

       results = {{"1","3"},{"1","4"},{"1","5"},
                  {"2","3"},{"2","4"},{"2","5"}}
    

    And the number of operations would be T = f(1) * k1 + f(2) * k2. As k = max(k1,k2) = 3, we have T<=3 * (f(1) + f(2)) <= 3 * f(2) *2. End of update

    Also note that the number of elements in results is bounded by O(n * k^n), so this is also the complexity of the second loop.

    As you correctly say, if we let m = k^n, then the complexity is O(m * n). However this notation may be misleading as the input array can be seen as an n by k array, so it is better to give complexity estimation in terms of the dimensions of the input array, so it is O(n * k^n).