Search code examples
haskelltypesprimesprime-factoringtype-signature

Factorising an integer recursively and some questions about functions in Haskell


I've just started to learn Haskell after coming from Python, and I've got a couple of questions about functions. I've written the following code:

--generating prime list
primes = sieve [2..]
sieve (p:ps) = p : sieve [x | x <- ps, mod x p /= 0]

--factorising function
--takes an input of a number and a list of primes and outputs a list of its prime factors
factorise (n,ps)
    | mod n head ps /= 0    = div n head ps : factorise (div n head ps, ps)
    | otherwise             = factorise (n,tail ps)

Firstly, when I try to compile, I get an error relating to n, saying I cannot construct the infinite type: a ~ [a] -> a, why is this?

Secondly, while I understand the logic behind creating an infinite list, why do you not have to explicitly state the types of the function sieve, are the types implied? Would I have to, for the factorise function?

Lastly, is there a more concise way of writing the above algorithm (which I understand is horribly efficient)?


Solution

  • Adding some parentheses here and there fixes the problem:

    factorise (n, ps)
        | mod n (head ps) /= 0  = div n (head ps) : factorise (div n (head ps), ps)
        | otherwise             = factorise (n, tail ps)
    

    Parentheses are used for grouping in Haskell. mod n (head ps) is an application of mod to two arguments, n and head ps; without the parens mod n head ps is an application of mod to three arguments, n, head and ps`, and this just doesn't typecheck.

    Indeed after the error is fixed all the types are successfully inferred, as

    primes :: Integral a => [a]
    sieve   :: Integral a => [a] -> [a]
    factorise :: Integral a => (a, [a]) -> [a]
    

    Now you can specialize them if you so wish, as

    primes :: [Integer]
    sieve   :: [Integer] -> [Integer]
    factorise :: (a ~ Integer) => (a, [a]) -> [a]
    

    (the latter way of writing the type for factorise needs the GADTs extension enabled).

    For your original code with the type signature included,

    factorise :: Integral a => (a, [a]) -> [a]
    factorise (n,ps)
        | mod n head ps /= 0    = div n head ps : factorise (div n head ps, ps)
        | otherwise             = factorise (n,tail ps)
    

    the error message is essentially the same. It says "Could not deduce (a ~ ([a] -> a))" but still shows that as far as the compiler can see, the a type must also be [a] -> a at the same time (so it must also be a ~ [a] -> ([a] -> a), a ~ [[a] -> a] -> ([a] -> ([a] -> a)), etc. etc. (hence the "infinite" type reference in the first error message)).

    If that were indeed the type you intended it could be made legal by naming it and making it explicitly recursive, as

    data T a = MkT ([T a] -> T a)
    

    This is allowed. Haskell is perfectly capable of having recursive types. After all, simple list type is recursive, as if defined by

    data [] a = [] | a : ([] a)
    data L a = Nil | Cons a (L a)
    

    I'll leave addressing the efficiency issues in factorise for later. With sieve though it is pretty straightforward: its main problem is that it only produces one prime at a time, whereas it is perfectly capable of producing more, much more than that, at each one step.

    Can you find a way how?


    Your factorise function is a perfectly fine rendering of the simple trial division factorization algorithm (except for few simple errors). One way to improve conciseness (and subsequently correctness!) is to introduce interim variables with let (or, better, where, to be able to use them in guards as well), as in

    factorise (n, ps)
      | mod n f /= 0  =                        -- is the test correct?
                        v : factorise (v, ps)  -- is it really v : ... ?
      | otherwise     = factorise (n, tail ps)
         where
           f = head ps 
           v = div n f
    

    ... except it never stops! You must include an additional test to stop producing the list of prime factors.

    Can you find a way how? :)