algorithm sorting haskell lazy-evaluation

the way merge-sort faster than insertion-sort puzzles me

Just got my feet wet in sorting algorithm with Haskell. I've implemented insertion-sort and merge-sort

insert_sort :: (Ord a, Show a) => [a] -> [a]
insert_sort keys = foldr f [] keys
           where f key []        = [key]
                 f key acc       = insert key acc
                 insert y []     = [y]
                 insert y (x:xs)
                     | x < y     = x : insert y xs
                     | otherwise = y : x : xs

merge_sort :: (Ord a, Show a) => [a] -> [a]
merge_sort (x:[]) = [x]
merge_sort keys   = merge  (merge_sort (take len keys)) (merge_sort (drop len keys))
      where len         = length keys `div` 2
            merge :: [a] -> [a] -> [a]
            merge (x:xs) []     = (x:xs)
            merge []     (y:ys) = (y:ys)
            merge (x:xs) (y:ys) = if x <= y
                                  then x : merge (xs) (y:ys)
                                  else y : merge (x:xs) ys

Here's how I compared their efficiency:

insert_sort $ take 100000 $ randomRs (1,100000) $ mkStdGen 1 ::[Int]
merge_sort $ take 100000 $ randomRs (1,100000) $ mkStdGen 1 ::[Int]

Both of them starts to print out results after a short delay but merge-sort prints much faster. As we know, merge-sort is much faster than insertion-sort for large data sets. I thought that would be shown by how they give results (like a long delay versus a short one) not how they print results. Is it because I use foldr in insertion-sort? What's behind the scene?

EDIT: Thx guys. I've heard about lazy evaluation since I started to learn Haskell but yet got the hang of it. Would anybody illustrate a bit more with a small data set, say [5,2,6,3,1,4]? How is it possible to output results before finish sorting with foldr since the first elements comes at last?

Solution

Behind the scene is lazy evaluation. The start of the sorted lists is determined before the sort is complete, so it can be output before the work is finished. Since a mergesort is faster, the merge-sorted list is printed out faster.

As requested: how sorting [5,2,6,3,1,4] proceeds. I use insert_sort = foldr ins [] for brevity.

insert_sort [5,2,6,3,1,4]
  = foldr ins [] [5,2,6,3,1,4]
  = 5 `ins` foldr ins [] [2,6,3,1,4]
  = 5 `ins` 2 `ins` [6,3,1,4] ...
  = 5 `ins` 2 `ins` 6 `ins` 3 `ins` 1 `ins` 4 `ins` []
  = 5 `ins` 2 `ins` 6 `ins` 3 `ins` 1 `ins` (4:[])
  = 5 `ins` 2 `ins` 6 `ins` 3 `ins` (1:4:[])
  = 5 `ins` 2 `ins` 6 `ins` (1 : (3 `ins` (4:[])))
  = 5 `ins` 2 `ins` (1 : (6 `ins` (3 `ins` (4:[]))))
  = 5 `ins` (1 : (2 `ins` (6 `ins` (3 `ins` (4:[])))))
  = 1 : (5 `ins` (2 `ins` (6 `ins` (3 `ins` (4:[])))))  -- now 1 can be output
  = 1 : (5 `ins` (2 `ins` (6 `ins` (3:4:[]))))
  = 1 : (5 `ins` (2 `ins` (3 : (6 `ins` (4:[])))))
  = 1 : (5 `ins` (2 : (3 : (6 `ins` (4:[])))))
  = 1 : 2 : (5 `ins` (3 : (6 `ins` (4:[]))))            -- now 2 can be output
  = 1 : 2 : 3 : (5 `ins` (6 `ins` (4:[])))              -- now 3
  = 1 : 2 : 3 : (5 `ins` (4:6:[]))
  = 1 : 2 : 3 : 4 : (5 `ins` (6:[]))                    -- now 4
  = 1 : 2 : 3 : 4 : 5 : 6 : []                          -- done

And merge sort (abbreviations: merge = mg, merge_sort = ms):

merge_sort [5,2,6,3,1,4]
  = mg (ms [5,2,6]) (ms [3,1,4])
  = mg (mg (ms [5]) (ms [2,6])) (mg (ms [3]) (ms [1,4]))
  = mg (mg [5] (mg [2] [6])) (mg [3] (mg [1] [4]))
  = mg (mg [5] [2,6]) (mg [3] [1,4])
  = mg (2 : mg [5] [6]) (1 : mg [3] [4])
  = 1 : mg (2 : mg [5] [6]) (mg [3] [4])                -- now 1 can be output
  = 1 : mg (2 : mg [5] [6]) [3,4]
  = 1 : 2 : mg (mg [5] [6]) [3,4]                       -- now 2 can be output
  = 1 : 2 : mg [5,6] [3,4]
  = 1 : 2 : 3 : mg [5,6] [4]                            -- now 3
  = 1 : 2 : 3 : 4 : mg [5,6] []                         -- now 4
  = 1 : 2 : 3 : 4 : 5 : 6 : []                          -- now 5 and 6

Admittedly I've taken a few short cuts, but Haskell isn't the only lazy one.