Search code examples
c++algorithmdata-structuresnumber-theorygreatest-common-divisor

How to maximize the GCD of n positive integers with minimum number of removals from the sequence that represents them?


I'm trying to solve a competitive programming task which I cannot find out effective solution to tackle the last 5 test cases with.

You are given with t (t >= 1 && t <= 5) inquiries each of which consists of n numbers num (num >= 1 && num <= 1000000). The sequence that represents each inquiry isn’t sorted and there can be repeating numbers in it. The sum of all ns is less than 1000000. Let's call the GCD of all numbers in a single inquiry x. The task is to find out what is the minimum number of removals that have to be made in order to maximize x. Time limit is 0.7 s.

Consider that this (8, 2, 6, 4, 10, 12) is the inquiry that I’m given. GCD (8, 2, 6, 4, 10, 12) = 2 but If I remove 2, 6 and 10 GCD (8, 4, 12) = 4. I increased initial GCD x from 2 to 4 with 3 removals namely 2, 6 and 10. The answer of this inquiry is 3.

The best ideas I've come up with so far are the following ones:

#include <iostream>
#include <vector>
#include <map>
#include <numeric>
#include <limits>
#include <algorithm>
#include <iterator>

int main() {
    std::ios_base::sync_with_stdio(false);
    std::cin.tie(nullptr);
    int t;
    std::cin >> t;
    int i, n, tmp, j, num, div, max, min, rem;
    std::vector<int> ans;
    for (i = 0; i != t; ++i) {
        std::cin >> n;
        tmp = 0;
        std::map<int, int> buckets;
        for (j = 0; j != n; ++j) {
            std::cin >> num;
            tmp = std::gcd(tmp, num);
            for (div = 1; div * div <= num; ++div) {
                if (num % div == 0) {
                    ++buckets[div];
                    if (div * div != num) {
                        ++buckets[num / div];
                    }
                }
            }
        }
        max = tmp;
        min = std::numeric_limits<int>::max();
        for (auto const& elem : buckets) {
            if (elem.second != n && elem.first > max) {
                rem = n - elem.second;
                if (rem < min) {
                    max = elem.first;
                    min = rem;
                }
            }
        }
        ans.emplace_back(min);
    }
    std::copy(ans.cbegin(), std::prev(ans.cend()),
        std::ostream_iterator<int>(std::cout, "\n"));
    std::cout << ans.back() << '\n';
    return 0;
}

I’m calculating x (tmp) and splitting each number (num) into its divisors (div). I’m constantly updating buckets which is a std::map<div, count of numbers in the particular inquiry t that this div can divide>. I loop through buckets and find out which is the divisor (elem.first) that is greater than max (x) and at the same time leads to minimum number of removals.

#include <iostream>
#include <vector>
#include <numeric>
#include <algorithm>
#include <limits>

int main() {
    std::ios_base::sync_with_stdio(false);
    std::cin.tie(nullptr);
    int t;
    std::cin >> t;
    int i, n, tmp_gcd, tmp_max, j, num,
        ans, k, tmp_sum, l, tmp_min;
    for (i = 0; i != t; ++i) {
        std::cin >> n;
        std::vector<int> inq_db;
        inq_db.reserve(n);
        tmp_max = tmp_gcd = 0;
        for (j = 0; j != n; ++j) {
            std::cin >> num;
            inq_db.emplace_back(num);
            tmp_gcd = std::gcd(tmp_gcd, num);
            tmp_max = std::max(tmp_max, num);
        }
        std::vector<int> buckets(tmp_max + 1);
        for (auto const& elem : inq_db) {
            ++buckets[elem];
        }
        ans = std::numeric_limits<int>::max();
        for (k = tmp_gcd + 1; k <= tmp_max; ++k) {
            tmp_sum = 0;
            for (l = k; l <= tmp_max; l += k) {
                tmp_sum += buckets[l];
            }
            tmp_min = n - tmp_sum;
            if (tmp_min != n && tmp_min < ans) {
                ans = tmp_min;
            }
        }
        std::cout << ans << '\n';
    }
    return 0;
}

I’m calculating x (tmp_gcd) and finding out maximum value of num (tmp_max) in the inquiry t that I’m working with. I initialize buckets with max + 1 zeroes (counters). I do this in order to avoid initializing buckets with 1000000 counters each time. Instead of splitting the number into its divisors I’m using the counters in buckets to count the numbers that are equal to their index. 5th element in buckets is counting how many numbers with the value 5 I have in the inquiry, 100th element – value 100 and so on. I loop through buckets starting from index x + 1 and find out how many numbers with GCD = k I have in buckets. The idea here is that numbers k, 2k, 3k, 4k, 5k and so on have GCD = k. I find out which is the number (ans) that leads to minimum number of removals.

The problem with both my ideas is that they are given with time limit exceeded on the last 5 test cases. I need some help to optimize them or If this is not possible to be given a new one. Thanks in advance for your time.

@Damien I reworked your logic this way:

#include <iostream>
#include <map>
#include <vector>
#include <numeric>
#include <cmath>
#include <algorithm>

std::vector<int> split(int num) {
    std::vector<int> ret;
    if ((num & 1) == 0) {
        ret.emplace_back(2);
        do {
            num >>= 1;
        } while ((num & 1) == 0);
    }
    int div, div_lim;
    for (div = 3, div_lim = static_cast<int>(std::sqrt(num)); div <= div_lim; div += 2) {
        if (num % div == 0) {
            ret.emplace_back(div);
            do {
                num /= div;
            } while (num % div == 0);
            div_lim = static_cast<int>(std::sqrt(num));
        }
    }
    if (num != 1) {
        ret.emplace_back(num);
    }
    return ret;
}

int main() {
    std::ios_base::sync_with_stdio(false);
    std::cin.tie(nullptr);
    int t;
    std::cin >> t;
    int i, n, x, j, num, ans;
    std::map<int, int>::const_iterator it;
    for (i = 0; i != t; ++i) {
        std::cin >> n;
        std::vector<int> seq;
        seq.reserve(n);
        x = 0;
        for (j = 0; j != n; ++j) {
            std::cin >> num;
            seq.emplace_back(num);
            x = std::gcd(x, num);
        }
        if (x != 1) {
            for (auto& elem : seq) {
                elem /= x;
            }
        }
        std::map<int, int> lookup;
        for (auto const& elem : seq) {
            auto ret = split(elem);
            for (auto const& div : ret) {
                ++lookup[div];
            }
        }
        it = std::max_element(lookup.cbegin(), lookup.cend(),
            [](std::pair<const int, int> const& lhs, std::pair<const int, int> const& rhs) {
                return lhs.second < rhs.second;
            });
        ans = n - it->second;
        std::cout << ans << '\n';
    }
    return 0;
}

but sadly I hit TLE on the last 5 test cases.

@TonyK It is in Bulgarian because the task is given on a Bulgarian competitive programming competition.

If I have this (4 6 8 10 12 14) test GCD is 2. Now let's split each number into its divisors:

4 (1, 2, 4)

6 (1, 2, 3, 6)

8 (1, 2, 4, 8)

10 (1, 2, 5, 10)

12 (1, 2, 3, 4, 6, 12)

14 (1, 2, 7, 14)

The candidates for GCD > 2 are the following ones:

14->I have to remove all the numbers but 14->5 removals

12->I have to remove all the numbers but 12->5 removals

10->I have to remove all the numbers but 10->5 removals

8->I have to remove all the numbers but 8->5 removals

7->I have to remove all the numbers but 14->5 removals

6->I have to remove 4, 8, 10 and 14->4 removals

5->I have to remove all the numbers but 10->5 removals

4->I have to remove 6, 10 and 14->3 removals

3->I have to remove 4, 8, 10 and 14->4 removals

The answer is 3 removals and they can be achieved at GCD = 4.


Solution

  • Here is a rather fast solution.

    The first step consists in calculating the global gcd and divide all numbers by it.

    The global GCD is now equal to 1. The problem is now to find the minimum number of removals such that this GC is no longer equal to 1.

    For that, we perform the prime decomposition of each number, and find the most frequent prime in them.

    The result is the array size minus the number of times this prime is present.

    The complexity is dominated by the prime decomposition: O(n sqrt(m)), where m is the maximum data value.

    #include <iostream>
    #include <vector>
    #include <numeric>
    #include <algorithm>
    #include <limits>
    #include <map>
    #include <cmath>
    
    //  For test
    void print (const std::vector<int> &v) {
            for (auto &x: v) {
                    std::cout << x << " ";
            }
            std::cout << std::endl;
    }
    
    //  Get the prime factors of a number (multiplicity is not important here)
    
    std::vector<int> get_primes (int n) {
        int n_sav = n;
        if (n <= 1) return {};
        std::vector<int> primes;
    
        if (n % 2 == 0) {
            primes.push_back(2);
            n /= 2;
            while (n%2 == 0) {n /= 2;}
        }
        int max_prime = sqrt(n);
        int p = 3;
        while (p <= max_prime) {
            if (n % p == 0) {
                primes.push_back(p);
                n/= p;
                while (n%p == 0) {n /= p;}
                max_prime = sqrt(n);
            }
            p += 2;
        }
        if (n != 1) {
            primes.push_back(n);
        }
        //std::cout << n_sav << ": ";
        //print (primes);
        return primes;
    }
    
    
    int min_removals (std::vector<int>& numbers) {
        int n = numbers.size();
        if (n == 1) return -1;
        int current_gcd = numbers[0];
        for (int i = 1; i < n; ++i) current_gcd = std::gcd (current_gcd, numbers[i]);
        std::cout << "current GCD = " << current_gcd << "\n";
        if (current_gcd != 1) {
            for (int i = 0; i < n; ++i) numbers[i] /= current_gcd;
        }
        std::map<int, int> list_primes;
        for (auto x: numbers) {
            auto primes = get_primes(x);
            for (int i: primes) {
                list_primes[i]++;
            }
        }
        int most_frequent = 0;
        for (const auto& p: list_primes) {
            if (p.second > most_frequent) {
                most_frequent = p.second;
            }
        }
        
        return n - most_frequent;
    }
    
    int main() {
        std::ios_base::sync_with_stdio(false);
        std::cin.tie(nullptr);
        int t;
        std::cin >> t;
        while (t--) {
            int n;
            std::cin >> n;
            std::vector<int> numbers (n);
            for (int i = 0; i < n; ++i) {
                std::cin >> numbers[i];
            }
            auto ans = min_removals (numbers);
            std::cout << ans << "\n";
        }
        return 0;
    }
    

    With the following variant, the speed is slightly increased by incorporating the prime decomposition in the main function. This avoids some useless data copies. It appears that the gain is not negligeable.

    int min_removals_new (std::vector<int>& numbers) {
        int n = numbers.size();
        if (n == 1) return -1;
        int current_gcd = numbers[0];
        for (int i = 1; (i < n) && (current_gcd > 1); ++i) current_gcd = std::gcd (current_gcd, numbers[i]);
        if (current_gcd != 1) {
            for (int i = 0; i < n; ++i) numbers[i] /= current_gcd;
        }
        std::map<int, int> list_primes;
        
        for (auto x: numbers) {
            if (x == 1) continue;
            if (x % 2 == 0) {
                list_primes[2]++;
                x /= 2;
                while (x%2 == 0) {x /= 2;}
            }
            int max_prime = sqrt(x);
            int p = 3;
            while (p <= max_prime) {
                if (x % p == 0) {
                    list_primes[p]++;
                    x/= p;
                    while (x%p == 0) {x /= p;}
                    max_prime = sqrt(x);
                }
                p += 2;
            }
            if (x != 1) {
                list_primes[x]++;
            }
        }
           
        int most_frequent = 0;
        for (const auto& p: list_primes) {
            if (p.second > most_frequent) {
                most_frequent = p.second;
            }
        }
        
        return n - most_frequent;
    }