Search code examples
f#functional-programmingprimesfactorization

When creating an intermediary value should I store it?


I am trying to learn F# so I paid a visit to Project Euler and I am currently working on Problem 3.

The prime factors of 13195 are 5, 7, 13 and 29.

What is the largest prime factor of the number 600851475143?

Some things to consider:

  1. My first priority is to learn good functional habits.
  2. My second priority is I would like it to be fast and efficient.

Within the following code I have marked the section this question is regarding.

let isPrime(n:int64) = 
    let rec check(i:int64) = 
        i > n / 2L or (n % i <> 0L && check(i + 1L))
    check(2L) 

let greatestPrimeFactor(n:int64) =
    let nextPrime(prime:int64):int64 = 
        seq { for i = prime + 1L to System.Int64.MaxValue do if isPrime(i) then yield i }  
        |> Seq.skipWhile(fun v -> n % v <> 0L) 
        |> Seq.hd
    let rec findNextPrimeFactor(number:int64, prime:int64):int64 =
        if number = 1L then prime else 
            //************* No variable
            (fun p -> findNextPrimeFactor(number / p, p))(nextPrime(prime))    
            //*************               
            //************* Variable
            let p = nextPrime(prime)
            findNextPrimeFactor(number / p, p) 
            //*************
    findNextPrimeFactor(n, 2L) 

Update

Based off some of the feedback I have refactored the code to be 10 times faster.

module Problem3

module private Internal =
    let execute(number:int64):int64 = 
        let rec isPrime(value:int64, current:int64) = 
            current > value / 2L or (value % current <> 0L && isPrime(value, current + 1L))   
        let rec nextPrime(prime:int64):int64 = 
            if number % prime = 0L && isPrime(prime, 2L) then prime else nextPrime(prime + 1L)       
        let rec greatestPrimeFactor(current:int64, prime:int64):int64 =
            if current = 1L then prime else nextPrime(prime + 1L) |> fun p -> greatestPrimeFactor(current / p, p)
        greatestPrimeFactor(number, 2L)


let execute() = Internal.execute(600851475143L)

Update

I would like to thank everyone for there advice. This latest version is a compilation of all the advice I received.

module Problem3

module private Internal =
    let largestPrimeFactor number = 
        let rec isPrime value current = 
            current > value / 2L || (value % current <> 0L && isPrime value (current + 1L))   
        let rec nextPrime value =
            if number % value = 0L && isPrime value 2L then value else nextPrime (value + 1L) 
        let rec find current prime =
            match current / prime with
            | 1L -> prime
            | current -> nextPrime (prime + 1L) |> find current
        find number (nextPrime 2L)            

let execute() = Internal.largestPrimeFactor 600851475143L

Solution

  • Functional programming becomes easier and more automatic with practice, so don't sweat it if you don't get it absolutely right on the first try.

    With that in mind, let's take your sample code:

    let rec findNextPrimeFactor(number:int64, prime:int64):int64 =
            if number = 1L then prime else 
                //************* No variable
                (fun p -> findNextPrimeFactor(number / p, p))(nextPrime(prime))    
                //*************               
                //************* Variable
                let p = nextPrime(prime)
                findNextPrimeFactor(number / p, p) 
                //*************
    

    Your no variable version is just weird, don't use it. I like your version with the explicit let binding.

    Another way to write it would be:

    nextPrime(prime) |> fun p -> findNextPrimeFactor(number / p, p)
    

    Its ok and occasionally useful to write it like this, but still comes across as a little weird. Most of the time, we use |> to curry values without needing to name our variables (in "pointfree" style). Try to anticipate how your function will be used, and if possible, re-write it so you can use it with the pipe operator without explicit declared variables. For example:

    let rec findNextPrimeFactor number prime =
        match number / prime with
        | 1L -> prime
        | number' -> nextPrime(prime) |> findNextPrimeFactor number'
    

    No more named args :)


    Ok, now that we have that out of the way, let's look at your isPrime function:

    let isPrime(n:int64) = 
        let rec check(i:int64) = 
            i > n / 2L or (n % i <> 0L && check(i + 1L))
        check(2L) 
    

    You've probably heard to use recursion instead of loops, and that much is right. But, wherever possible, you should abstract away recursion with folds, maps, or higher order functions. Two reasons for this:

    1. its a little more readable, and

    2. improperly written recursion will result in a stack overflow. For example, your function is not tail recursive, so it'll blow up on large values of n.

    I'd rewrite isPrime like this:

    let isPrime n = seq { 2L .. n / 2L } |> Seq.exists (fun i -> n % i = 0L) |> not
    

    Most of the time, if you can abstract away your explicit looping, then you're just applying transformations to your input sequence until you get your results:

    let maxFactor n =
        seq { 2L .. n - 1L }                        // test inputs
        |> Seq.filter isPrime                       // primes
        |> Seq.filter (fun x -> n % x = 0L)         // factors
        |> Seq.max                                  // result
    

    We don't even have intermediate variables in this version. Coolness!


    My second priority is I would like it to be fast and efficient.

    Most of the time, F# is going to be pretty comparable with C# in terms of speed, or it's going to be "fast enough". If you find your code takes a long time to execute, it probably means you're using the wrong data structure or a bad algorithm. For a concrete example, read the comments on this question.

    So, the code I've written is "elegant" in the sense that its concise, gives the correct results, and doesn't rely on any trickery. Unfortunately, its not very fast. For start:

    • it uses trial division to create a sequence of primes, when the Sieve of Eratosthenes would be much faster. [Edit: I wrote a somewhat naive version of this sieve which didn't work for numbers larger than Int32.MaxValue, so I've removed the code.]

    • read Wikipedia's article on the prime counting function, it'll give you pointers on calculating the first n primes as well as estimating the upper and lower bounds for the nth prime.

    [Edit: I included some code with a somewhat naive implementation of a sieve of eratosthenes. It only works for inputs less than int32.MaxValue, so it probably isn't suitable for project euler.]