haskell ghc currying pointfree partial-application

Point Free Style Required for Optimized Curry

Say we have a (contrived) function like so:

import Data.List (sort)

contrived :: Ord a => [a] -> [a] -> [a]
contrived a b = (sort a) ++ b

And we partially apply it to use elsewhere, eg:

map (contrived [3,2,1]) [[4],[5],[6]]

On the surface, this works as one would expect:

[[1,2,3,4],[1,2,3,5],[1,2,3,6]]

However, if we throw some traces in:

import Debug.Trace (trace)

contrived :: Ord a => [a] -> [a] -> [a]
contrived a b = (trace "sorted" $ sort a) ++ b
map (contrived $ trace "a value" [3,2,1]) [[4],[5],[6]]

We see that the first list passed into contrived is evaluated only once, but it is sorted for each item in [4,5,6]:

[sorted
a value
[1,2,3,4],sorted
[1,2,3,5],sorted
[1,2,3,6]]

Now, contrived can be rather simply translated to point-free style:

contrived :: Ord a => [a] -> [a] -> [a]
contrived a = (++) (sort a)

Which when partially applied:

map (contrived [3,2,1]) [4,5,6]

Still works as we expect:

[[1,2,3,4],[1,2,3,5],[1,2,3,6]]

But if we again add traces:

contrived :: Ord a => [a] -> [a] -> [a]
contrived a = (++) (trace "sorted" $ sort a)
map (contrived $ trace "a value" [3,2,1]) [[4],[5],[6]]

We see that now the first list passed into contrived is evaluated and sorted only once:

[sorted
a value
[1,2,3,4],[1,2,3,5],[1,2,3,6]]

Why is this so? Since the translation into pointfree style is so trivial, why can't GHC deduce that it only needs to sort a once in the first version of contrived?

Note: I know that for this rather trivial example, it's probably preferable to use pointfree style. This is a contrived example that I've simplified quite a bit. The real function that I'm having the issue with is less clear (in my opinion) when expressed in pointfree style:

realFunction a b = conditionOne && conditionTwo
  where conditionOne = map (something a) b
        conditionTwo = somethingElse a b

In pointfree style, this requires writing an ugly wrapper (both) around (&&):

realFunction a = both conditionOne conditionTwo
  where conditionOne = map (something a)
        conditionTwo = somethingElse a
        both f g x = (f x) && (g x)

As an aside, I'm also not sure why the both wrapper works; the pointfree style of realFunction behaves like the pointfree style version of contrived in that the partial application is only evaluated once (ie. if something sorted a it would only do so once). It appears that since both is not pointfree, Haskell should have the same issue that it had with the non-pointfree contrived.

Solution

If I understand correctly, you are looking for this:

contrived :: Ord a => [a] -> [a] -> [a]
contrived a = let a' = sort a in \b -> a' ++ b
                    -- or ... in (a' ++)

If you want the sort to be computed only once, it has to be done before the \b.

You are correct in that a compiler could optimize this. This is known as the "full laziness" optimization.

If I remember correctly, GHC does not always do it because it's not always an actual optimization, in the general case. Consider the contrived example

foo :: Int -> Int -> Int
foo x y = let a = [1..x] in length a + y

When passing both arguments, the above code works in constant space: the list elements are immediately garbage collected as they are produced.

When partially applying x, the closure for foo x only requires O(1) memory, since the list is not yet generated. Code like

let f = foo 1000 in f 10 + f 20  -- (*)

still run in constant space.

Instead, if we wrote

foo :: Int -> Int -> Int
foo x = let a = [1..x] in (length a +)

then (*) would no longer run in constant space. The first call f 10 would allocate a 1000-long list, and keep it in memory for the second call f 20.

Note that your partial application

... = (++) (sort a)

essentially means

... = let a' = sort a in \b -> a' ++ b

since argument passing involves a binding, as in let. So, the result of your sort a is kept around for all the future calls.