Search code examples
loopsoptimizationrustnested-loopsprimes

Why is making a nested loop using a 2nd function faster than making it all in one function in rust?


I am trying to learn rust, and as an exercise I am making a program that calculates all the prime numbers below certain number. Then, by accident, I found that making a loop within another loop is faster when the secondary loop is made inside another function and calling it, than when making it immediately inside of the loop. I do not understand why this can be, and I am hoping that you could explain me why this happens.

The particular case

Here is my little program:

//to get system time
use std::time::{SystemTime};
//to read the input
use std::io;

fn main() {
    println!("Welcome to the prime number calculation program. Please enter a number, and the quantity of prime numbers between 2 and that number will be calculated.");
    let mut target_input = String::new();
    io::stdin().read_line(&mut target_input).expect("Error: failed to read number!");
    let target: u64 = target_input.trim().parse().unwrap();
    println!("Calculating now prime numbers until {}.", target);
    println!("Using algorithm A.");
    let a = || {
        algo_a(target.clone())
    };
    println!("{} prime numbers found.", execute_with_timer(a));
    //algo B
    println!("Using algorithm B.");
    let b = || {algo_b(target.clone())};
    println!("{} prime numbers found.", execute_with_timer(b));
    //algo c
    println!("Using algorithm C.");
    let c = || {algo_c(target.clone())};
    println!("{} prime numbers found.", execute_with_timer(c));
}

fn execute_with_timer<F: Fn()->u64>(closure:F)-> u64 {
    let time_start = SystemTime::now();
    let result = closure();
    let time_end = SystemTime::now();
    let time_passed = time_end.duration_since(time_start).expect("Clock may have gone backwards");
    println!("{:?} passed.", time_passed);
    return result;
}

//algo b but without external function called 
fn algo_c(target:u64)->u64{
    let mut counter: u64 = 0;
    //collection of found numbers
    let mut found_primes: Vec<u64> = Vec::new();
    //check the range
    let mut is_prime_bool: bool;
    //inclusive range
    for i in 2u64..=target {
        is_prime_bool = true;
        for p in found_primes.iter() {
            //if i number can be divided by any previously found p prime, then it is not a prime
            if i % p == 0 {
                is_prime_bool = false
            }
        }
        if is_prime_bool {
            counter = counter + 1;
            //add to the list of found numbers
            found_primes.push(i);
        }
    }
    return counter;
}

//algo with register of already found primes
fn algo_b(target:u64)->u64 {
    let mut counter: u64 = 0;
    //collection of found numbers
    let mut found_primes: Vec<u64> = Vec::new();
    //check the range
    //inclusive range
    for i in 2u64..=target {
        if is_prime_reg(i, &found_primes) {
            counter = counter + 1;
            //add to the list of found numbers
            found_primes.push(i);
        }
    }
    return counter;
}

//function with register
//WARNING: This function will only work if used in an ordered way, in an loop increasing n+1
//a list of already found prime numbers must be supplied
fn is_prime_reg(number:u64, found_primes: &Vec<u64>)->bool{
    //first check dividing by the known primes
    for p in found_primes.iter() {
        if number % p == 0 {
            return false;
        }
    }
    return true;
}

//the slowest algorithm
fn algo_a(target: u64)->u64{
    let mut counter: u64 = 0;
    //inclusive range
    for i in 2u64..=target {
        if is_prime_simplest(i) {
            counter = counter + 1;
        }
    }
    return counter;
}
//Simple algorithm: the slowest traditional way of dividing by each lower number
fn is_prime_simplest(number: u64) -> bool{
    //try to find if it can make a exact division, if it can it is not prime.
    //exclusive range
    for i in 2u64..number {
        if number % i == 0 {
            return false;
        }
    }
    return true;
}

One of the algorithms (algo_b) is optimized to check if a number is prime only dividing by the already found primes, instead of by all numbers. For this algorithm I made the second loop happen by calling another function. The detail is that, while trying to invent a better algorithm, I casually made another version of this algo_b that is basically the same mathematically, but in the program, the second loop happens within the first function. I called this "algo_c". Then, when I compiled the code and executed it, it happened that the algo_c is much slower than the algo_b, even though they do mathematically the same, and worse, actually algo_c is even slightly slower than algo_a, the brute force way to calculate the prime numbers.

Times when using cargo run (test compilation):

Calculating now prime numbers until 100000.

Using algorithm A. 5.447370201s passed. 9592 prime numbers found.

Using algorithm B. 1.091195151s passed. 9592 prime numbers found.

Using algorithm C. 12.764870792s passed. 9592 prime numbers found.

Times when using cargo build -r (release compilation with optimizations):

Calculating now prime numbers until 100000.

Using algorithm A. 1.663481161s passed. 9592 prime numbers found.

Using algorithm B. 169.890757ms passed. 9592 prime numbers found.

Using algorithm C. 1.833567947s passed. 9592 prime numbers found.

I do not get to understand how the algo_c can be the slowest one. Changing the number type from u64 to u32 or u16 did not change the fact that algo_c was always the slowest. I do not see why anything of this happens.

Why is this happening?

Thank you very much.


Solution

  • In your algo_c helper function you have this:

    for p in found_primes.iter() {
        if number % p == 0 {
            return false;
        }
    }
    

    So, the loop will end and the function will return as soon as the first prime is found that the number is divisible by.

    In your algo_b you have this:

    for p in found_primes.iter() {
        if i % p == 0 {
            is_prime_bool = false
        }
    }
    

    The loop will continue even after the first match is found, to iterate all the primes in found_primes.

    To fix it, add break to exit the loop early:

    for p in found_primes.iter() {
        if i % p == 0 {
            is_prime_bool = false;
            break
        }
    }