I am very new to Haskell. What could I replace the "undefined" in the definition below so that the "frequencies" calculates the frequency distribution of items in the input list. For example, the expression "frequencies [True, False, True]" should produce a distribution where True has a frequency of 2 and False has a frequency of 1.
I can add new top-level function definitions, but not modify any other part of the definition of "frequencies". In particular, I can't add any additional named arguments or remove the call to "foldMap" in the definition of "frequencies".
type Dist a = a -> Sum
frequencies :: Eq a => [a] -> Dist a
frequencies = foldmap = undefined
You can implement this by following the types using GHC typed holes, entering an underscore _
to ask the compiler what type of an expression should be.
import Data.Semigroup (Sum(..))
-- Using the standard 'Sum' for illustration.
type Dist a = a -> Sum Int
frequencies :: (Eq a) => [a] -> Dist a
frequencies = foldMap _
Here, the compiler reports that _ :: a -> Dist a
. So we’re given each a
value in the input, and must produce a corresponding Dist a
, which are then combined into a final result with foldMap
. This relies on the fact that there are Semigroup
and Monoid
instances for functions, which just combine their results:
instance (Semigroup m) => Semigroup (a -> m) where
f <> g = \ x -> f x <> g x
instance (Monoid m) => Monoid (a -> m) where
mempty = \ _x -> mempty
Dist a
is a function that accepts an a
and returns a Sum
, which may be combined using its Semigroup
/Monoid
instances to add the results. So we want to introduce a lambda:
frequencies :: (Eq a) => [a] -> Dist a
frequencies = foldMap (\ x -> _)
The hole now has type Dist a
, which is a function, so we introduce another parameter:
frequencies :: (Eq a) => [a] -> Dist a
frequencies = foldMap (\ x -> \ y -> _)
Now we must produce a Sum Int
value. All we have are two a
values, and a constraint that a
can be compared with Eq
. So let’s introduce an if
to test their equality:
frequencies :: (Eq a) => [a] -> Dist a
frequencies = foldMap (\ x -> \ y -> if x == y then _ else _)
What Sum
values should we use for the holes? Presumably we want to add 1 to the total if the values are equal, and nothing if they differ. So we can use Sum 1
for the true case, and mempty
(or Sum 0
) for the false case. We can also collapse the lambda using the normal syntactic sugar for multiple parameters.
frequencies :: (Eq a) => [a] -> Dist a
frequencies = foldMap (\ x y -> if x == y then Sum 1 else mempty)
Now we can call this function on a list, and it will produce a function that gives us the frequency for a particular element.
> f = frequencies "AAB"
> :t f
f :: Dist Char
> f 'A'
Sum {getSum = 2}
> f 'B'
Sum {getSum = 1}
From this, we can build more interesting things, like a histogram:
import Data.List (nub)
histogram :: (Ord a) => [a] -> [(a, Int)]
histogram xs = let
keys = nub (sort xs)
frequency = frequencies xs
in zip keys (map (getSum . frequency) keys)
> histogram "AAB"
[('A',2),('B',1)]