Search code examples
multithreadinghaskellparallel-processingmulticore

Multicore programming in Haskell - Control.Parallel


I'm trying to learn how to use the Control.Parallel module, but I think I didn't get it right.

I'm trying to run the following code (fibs.hs).

import Control.Parallel

fib :: Int -> Int
fib 0 = 0
fib 1 = 1
fib n = p `par` (q `pseq`  (p + q))
    where
      p = fib (n-1)
      q = fib (n-2)


main = print $ fib 30

I compiled this with:

ghc -O2 --make -threaded fibs.hs

And then I get the following results executing this program (output of a Python script that runs each program 100 times and returns the average and standard deviation of the execution time):

./fibs +RTS -N1 -> avg= 0.060203 s, deviation = 0.004112 s  
./fibs +RTS -N2 -> avg= 0.052335 s, deviation = 0.006713 s  
./fibs +RTS -N3 -> avg= 0.052935 s, deviation = 0.006183 s  
./fibs +RTS -N4 -> avg= 0.053976 s, deviation = 0.007106 s  
./fibs +RTS -N5 -> avg= 0.055227 s, deviation = 0.008598 s  
./fibs +RTS -N6 -> avg= 0.055703 s, deviation = 0.006537 s  
./fibs +RTS -N7 -> avg= 0.058327 s, deviation = 0.007526 s  

My questions are:

  1. What exactly is happening when I evaluate:

    a `par` (b `pseq` (a + b))   ?
    

    I understand that a par b is supposed to hint the compiler about calculating a in parallel with b and return b. OK. But what does pseq do?

  2. Why do I see such a small performance increase? I'm running this in an Intel Core 2 Quad machine. I'd expect that running with -N5 or -N6 wouldn't make a real difference in performance or that the program would actually start to perform very badly. But why do I see no improvement from -N2 to -N3 and why is the initial improvement so small?


Solution

  • As Don explained, the problem is that you are creating too many sparks. Here's how you might rewrite it to get a good speedup.

    import Control.Parallel
    
    cutoff :: Int
    cutoff = 20
    
    parFib :: Int -> Int
    parFib n | n < cutoff = fib n
    parFib n = p `par` q `pseq` (p + q)
        where
          p = parFib $ n - 1
          q = parFib $ n - 2
    
    fib :: Int -> Int
    fib 0 = 0
    fib 1 = 1
    fib n = fib (n - 1) + fib (n - 2)
    
    main :: IO ()
    main = print $ parFib 40
    

    demonstration:

    [computer ~]$ ghc --make -threaded -O2 Main.hs
    [1 of 1] Compiling Main             ( Main.hs, Main.o )
    Linking Main ...
    [computer ~]$ time ./Main +RTS -N1
    102334155
    
    real    0m1.509s
    user    0m1.450s
    sys     0m0.003s
    [computer ~]$ time ./Main +RTS -N2
    102334155
    
    real    0m0.776s
    user    0m1.487s
    sys     0m0.023s
    [computer ~]$ time ./Main +RTS -N3
    102334155
    
    real    0m0.564s
    user    0m1.487s
    sys     0m0.030s
    [computer ~]$ time ./Main +RTS -N4
    102334155
    
    real    0m0.510s
    user    0m1.587s
    sys     0m0.047s
    [computer ~]$