Search code examples
c++performanceprimessieve-of-eratosthenes

Checking Uniqueness of Shifting Sieve of Eratosthenes


My code is as follows:

int main(){
    vector<int> primes;
    vector<int> primesSum;
    int tally = 1;
    bool primeFound = false;
    while (true) {
        int i = 0;
        while (!primeFound) {
            primeFound = true;
            tally++;
            for (i = 0; i < (int)primes.size(); i++) {
                if (tally == primesSum[i]) {
                    primeFound = false;
                    primesSum[i] += primes[i];
                }
            }
        }
        primeFound = false;
        primesSum.push_back(tally*2);
        primes.push_back(tally);
    }
}

It seems to generate prime numbers fine but I was wondering if I could increase its efficiency by implementing the rule that all primes are 6n +/- 1 after 2 and 3. That would seem to sacrifice my achievement of initially empty vectors though.

I've seen prime number verifiers that make use of the 6n +/- i rule but not really in sieve programs, likely as it breaks the 2 filter.

Edit: as an aside I would prefer to not use modulo as that is another achievement for this program, unless a modulo is more efficient (time-wise not space-wise) than what I currently have.


Solution

  • Here is some code that I wrote to efficiently generate primes using a sieve that only considers integers that are +-1 mod 6. It is written as a generator using a coroutine library, but the logic is equivalent. It uses a vector of uint64_t's as a bit mask.

    There are no modulo's related to computing the primes, but they are used to access the bitmap.

    Update

    I extracted my original code from the generator and created a simple function that puts the primes in a vector. One method uses the +-1 mod 2 search and the other +-1 mod 6 search. I also extracted the OP's code into a function.

    I ran some simple timing measurements on an M1 Max (arm64).

    function primes<1M primes<10M primes<100M
    sieve_index 12904 ms still running...
    sieve_2n 2 ms 30 ms 222 ms
    sieve_6n. 1 ms 15 ms 132 ms

    The code and build instructions can be found at the GitHub repo cpp-core/so

    Code

    auto sieve_mod6_prime_seq(int max = int{1} << 20) {
        std::vector<int> primes;
        primes.push_back(2);
        primes.push_back(3);
    
        auto max_index = max / 3;
        auto bits_per = sizeof(uint64_t) * CHAR_BIT;
        auto nwords = (bits_per + max_index - 1) / bits_per;
        std::vector<uint64_t> words(nwords);
    
        words[0] |= 1;
        size_t wdx = 0;
        while (wdx < nwords) {
            auto b = std::countr_one(words[wdx]);
            auto p = 3 * (64 * wdx + b) + 1 + (b bitand 1);
            if (b < 64 and p < max) {
                primes.push_back(p);
    
                for (auto j = p; j < max; j += 6 * p) {
                    auto idx = j / 3;
                    auto jdx = idx / 64;
                    auto jmask = uint64_t{1} << (idx % 64);
                    words[jdx] |= jmask;
                }
    
                for (auto j = 5 * p; j < max; j += 6 * p) {
                    auto idx = j / 3;
                    auto jdx = idx / 64;
                    auto jmask = uint64_t{1} << (idx % 64);
                    words[jdx] |= jmask;
                }
            }
            else {
                ++wdx;
            }
        }
        return primes;
    }
    
    auto sieve_mod2_prime_seq(int max = int{1} << 20) {
        std::vector<int> primes;
    
        auto bits_per = 2 * sizeof(uint64_t) * CHAR_BIT;
        auto nwords = (bits_per + max - 1) / bits_per;
        std::vector<uint64_t> words(nwords);
        if (max > 2)
            primes.push_back(2);
        words[0] |= 1;
        size_t idx = 0;
        while (idx < nwords) {
            auto bdx = 2 * std::countr_one(words[idx]) + 1;
            auto p = 128 * idx + bdx;
            if (bdx < 128 and p < max) {
                primes.push_back(p);
                for (auto j = p; j < max; j += 2 * p) {
                    auto jdx = j / 128;
                    auto jmask = uint64_t{1} << ((j % 128) / 2);
                    words[jdx] |= jmask;
                }
            }
            else {
                ++idx;
            }
        }
        return primes;
    }
    
    auto sieve_index(int max = int{1} << 20) {
        std::vector<int> primes;
        std::vector<int> primesSum;
        int tally = 1;
        bool primeFound = false;
        while (tally < max) {
            int i = 0;
            while (!primeFound) {
                primeFound = true;
                tally++;
                for (i = 0; i < (int)primes.size(); i++) {
                    if (tally == primesSum[i]) {
                        primeFound = false;
                        primesSum[i] += primes[i];
                    }
                }
            }
            primeFound = false;
            primesSum.push_back(tally*2);
            primes.push_back(tally);
        }
        return primes;
    }
    
    template<class Work>
    void measure(std::ostream& os, std::string_view desc, Work&& work) {
        chron::StopWatch timer;
        timer.mark();
        work();
        auto millis = timer.elapsed_duration<std::chrono::milliseconds>().count();
        os << fmt::format("{:>30s}: {} ms", desc, millis) << endl;
    }
    
    int tool_main(int argc, const char *argv[]) {
        ArgParse opts
            (
             argValue<'n'>("number", 1000, "Number of primes"),
             argFlag<'v'>("verbose", "Verbose diagnostics")
             );
        opts.parse(argc, argv);
        auto n = opts.get<'n'>();
        // auto verbose = opts.get<'v'>();
    
        measure(cout, "sieve_index", [&]() { sieve_index(n); });
        measure(cout, "sieve_2n", [&]() { sieve_mod2_prime_seq(n); });
        measure(cout, "sieve_6n", [&]() { sieve_mod6_prime_seq(n); });
        return 0;
    }
    

    End Update

    You can easily adapt it to construct a vector of primes by replacing each co_yield with vec.push_back.

    coro::Generator<uint64_t> sieve_mod6_prime_seq(uint64_t max = uint64_t{1} << 20) {
        co_yield 2;
        co_yield 3;
    
        auto max_index = max / 3;
        auto bits_per = sizeof(uint64_t) * CHAR_BIT;
        auto nwords = (bits_per + max_index - 1) / bits_per;
        std::vector<uint64_t> words(nwords);
    
        words[0] |= 1;
        size_t wdx = 0;
        while (wdx < nwords) {
            auto b = std::countr_one(words[wdx]);
            auto p = 3 * (64 * wdx + b) + 1 + (b bitand 1);
            if (b < 64 and p < max) {
                co_yield p;
    
                for (auto j = p; j < max; j += 6 * p) {
                    auto idx = j / 3;
                    auto jdx = idx / 64;
                    auto jmask = uint64_t{1} << (idx % 64);
                    words[jdx] |= jmask;
                }
    
                for (auto j = 5 * p; j < max; j += 6 * p) {
                    auto idx = j / 3;
                    auto jdx = idx / 64;
                    auto jmask = uint64_t{1} << (idx % 64);
                    words[jdx] |= jmask;
                }
            }
            else {
                ++wdx;
            }
        }
        co_return;
    }