Search code examples
arraysalgorithmsortingsubsetproduct

Find Nth number from a sorted array formed by multiplying any two or more consecutive natural numbers


I recently encountered this problem:

There is a (increasingly) sorted array formed by multiplying any two or more consecutive natural numbers.

2, 6, 12, 20, 24, 30, 42, 56, 60, 72 ...

Ex. 2 is formed by two consecutive natural numbers 1 and 2: 2 = 1×2. And 6 = 2×3 OR 1×2×3, 20 = 4×5.

If n is given as a parameter, find the nth number from the above array and return.

Limitation

  • 1 ≤ n ≤ 1000000
  • n is given only when the answer is smaller than 1012

So here I was able to find O(n2) solution, but I want to know if there is a better solution.

My O(n2) JS solution:

function solution(n) {
    // Find all possible product subsets of [1, ..., n] = [1x2, 2x3, 4x5, ..., 1x2x...xn]
    // return Nth index of this product subset array

    // 1 ~ n+1 array
    const nums = Array.from({ length: n+1 }, (_, i) => i + 1);
    const set = new Set();

    // Find all possible product subsets
    for (let i = 0; i < nums.length; i++) {
        let accu = 1;
        for (let j = i; j < nums.length; j++) {
            accu *= nums[j];
            if (i !== j) set.add(accu);
        }
    }

    // Sort and return n-1 index value
    return Array.from(set).sort((a,b) => a - b)[n-1];
}

Thanks for the help :)


Solution

  • The following implementation is based on a min-heap (std::priority_queue in C++), that memorizes the "best" future candidates.

    One important point is to treat the basic solutions k *(k+1) differently. As it is likely that these numbers are in the majority, this allows to greatly reduce the size of the heap.

    At each given time, we either decide to use a k(k+1)number, or to use the current top value of the min-heap. Each used value led to insertion of a new candidate in the min-heap.

    Another aspect is to only insert in the heap the values less then the estimated max value, n(n+1).

    The complexity is estimated to be O(n log M), where M is the mean size of the heap.

    For n = 10^6, the programme measures that the maximum size of the heap is equal to 9998, much less than n.

    On my PC, I get the result for n = 10^6 in 11 ms. Result: 977410038240

    Here is the C++ code.

    This code memorizes all the sequence, mainly for debugging. In practice, if we only need the nth value, such memorization can be avoided. The measurement of the maximum heap (useful for debugging) size can be removed too, if efficiency is still a concern.

    #include <iostream>
    #include <vector>
    #include <string>
    #include <queue>
    #include <chrono>
    
    template <typename T>
    void print (const std::vector<T> &A , const std::string &s = "") {
        std::cout << s;
        for (const T& val: A) {
                std::cout << val << " ";
        }
        std::cout << "\n";
    }
    
    struct Prod {
        long long p;    // product
        int last;       // last integer in the product
    };
    
    long long int consecutive_products (int n) {
        std::vector<long long> products;        // not strictly needed, for debugging
        products.reserve(n);
        products.push_back (2); products.push_back (6);
        
        long long max_val = (long long) n * (n+1);
        
        auto comp = [] (const Prod& x1, const Prod& x2) {
            if (x1.p == x2.p) return x1.last > x2.last;
            return x1.p > x2.p;
        };
        std::priority_queue<Prod, std::vector<Prod>, decltype(comp)> candidates(comp);
        if (n <= 2) return products[n-1];
        candidates.push ({24, 4});      // 2*3*4 -> extension of 2*3
        
        long long int prod_simple = 12; // = 3*4 - simple products k(k-1) are dealt with differently
        int rank_simple = 4;
        
        int index = 2;
        long long current_val = products[index - 1];
        Prod best;
        long long minval;
        int max_size = 0;
        while (index < n) {
            if (candidates.empty()) {
                minval = max_val;
            } else {
                best = candidates.top();
                minval = best.p;
            }
            if (minval <= prod_simple) {
                candidates.pop();
                long long new_product = minval * (best.last + 1);
                if (new_product < max_val) {
                    candidates.push ({new_product, best.last + 1});
                }
            } else {
                minval = prod_simple;
                long long new_product = prod_simple * (rank_simple + 1);
                if (new_product < max_val) {
                    candidates.push ({new_product, rank_simple + 1});
                }
                prod_simple = (long long) rank_simple * (rank_simple + 1);
                rank_simple++;
            }
            if (minval > current_val) {
                products.push_back(minval);
                current_val = minval;
                index++;
            }
            int size = candidates.size();
            if (size > max_size) max_size = size;
        }
        
        if (n <= 20) print (products, "Products: ");
        std::cout << "max heap size = " << max_size << std::endl;
        return minval;
    }
    
    int main() {
        int n;
        std::cout << "Enter n: ";
        std::cin >> n;
        auto t1 = std::chrono::high_resolution_clock::now();
        auto ans = consecutive_products (n);
        auto t2 = std::chrono::high_resolution_clock::now();
        std::cout << ans << std::endl;
        auto duration = std::chrono::duration_cast<std::chrono::microseconds>( t2 - t1 ).count();
        std::cout << "duration = " << duration << " micro-s" << std::endl;  
        return 0;
    }