Implementing a linspace
function, which takes Start, End and optionally Steps as its arguments and outputs a list of evenly distributed (numerical) values with first element being Start and the last being End. Length of resulting list is (Steps + 1). It's similar to MATLAB's linspace
.
One implementation is in Haskell, the other is in C. Guess who's billion times faster. Need help understanding what is wrong with the Haskell code.
Update for people, who asked some relevant information.
To compile C: gcc -o linspace linspace.c
gcc: 12.2.0
To compile Haskell: ghc -O2 -o linspace linspace.hs
ghc: 9.0.2
For benchmark use sh
's time linspace 0 10 100000
. My results are as follows.
C (test 1):
real 0m0.187s
user 0m0.076s
sys 0m0.012s
C (test 2):
real 0m0.132s
user 0m0.068s
sys 0m0.010s
Only two tests for C, because result is consistently within the range of these two.
Haskell:
*takes forever*
Haskell code.
{-# LANGUAGE BangPatterns #-}
import System.Environment
linspace :: Fractional a => a -> a -> Maybe Int -> [a]
linspace start end maybe = linspace' [] 0
where
stepCount =
case maybe of
Just i
| i > 0 -> i
| otherwise -> errorWithoutStackTrace "Number of steps must be > 0"
Nothing -> 100
stepSize = abs $ (end - start) / (fromIntegral stepCount)
linspace' !acc n
| n > stepCount = acc
| otherwise =
let !el = start + stepSize * (fromIntegral n)
in linspace' (acc ++ [el]) (n + 1)
main = do
argv <- getArgs
let argc = length argv
if argc /= 2 && argc /= 3
then errorWithoutStackTrace "linspace <start> <end> [steps]"
else return ()
let start = read $ argv !! 0 :: Double
end = read $ argv !! 1 :: Double
steps =
if argc == 3
then Just $ read (argv !! 2) :: Maybe Int
else Nothing
putStrLn . show $ linspace start end steps
C code.
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <math.h>
int
main (int argc, char **argv) {
if (argc != 3 && argc != 4) {
fprintf(stderr, "Usage: linspace <start> <end> [steps]\n");
return EXIT_FAILURE;
}
errno = 0;
double start, end;
int steps;
start = strtod(argv[1], NULL);
if (errno) {
perror(argv[1]);
return EXIT_FAILURE;
}
errno = 0;
end = strtod(argv[2], NULL);
if (errno) {
perror(argv[2]);
return EXIT_FAILURE;
}
errno = 0;
if (argc == 4) {
steps = (int) strtol(argv[3], NULL, 10);
if (errno) {
perror(argv[3]);
return EXIT_FAILURE;
} else if (steps <= 0) {
fprintf(stderr, "Number of steps must be > 0\n");
return EXIT_FAILURE;
}
} else
steps = 100;
double stepSize;
stepSize = fabs((start - end) / steps);
double *vals;
vals = malloc(sizeof(double) * (steps + 1));
for (int i = 0; i <= steps; i++) {
vals[i] = start + stepSize * i;
}
printf("[");
for (int i = 0; i <= steps; i++) {
if (i == steps)
printf("%f", vals[i]);
else
printf("%f, ", vals[i]);
}
printf("]\n");
}
Don't mind that I'm taking absolute values. Lets say it's not indented to be used with negative values.
The main problem with your Haskell code is here:
linspace' !acc n
| n > stepCount = acc
| otherwise =
let !el = start + stepSize * (fromIntegral n)
in linspace' (acc ++ [el]) (n + 1)
This is an O(n^2)
algorithm to construct the list because each acc ++ [e1]
expression requires the entire acc
list to be traversed before the [e1]
singleton can be added to the end.
If you delete the definition of linspace'
and replace the first linspace
line with:
linspace start end maybe = [start + fromIntegral n * stepSize | n <- [0..stepCount]]
you'll find that the resulting Haskell program is a little more competitive with the C version, maybe around 5 times slower instead of a billion.