Search code examples
haskellpattern-matchingsemanticslanguage-implementation

Haskell: Why ++ is not allowed in pattern matching?


Suppose we want to write our own sum function in Haskell:

sum' :: (Num a) => [a] -> a
sum' [] = 0
sum' (x:xs) = x + sum' xs

Why can't we do something like:

sum' :: (Num a) => [a] -> a
sum' [] = 0
sum' (xs++[x]) = x + sum' xs

In other words why can't we use ++ in pattern matching ?


Solution

  • You can only pattern match on constructors, not on general functions.

    Mathematically, a constructor is an injective function: each combination of arguments gives one unique value, in this case a list. Because that value is unique, the language can deconstruct it again into the original arguments. I.e., when you pattern match on :, you essentially use the function

    uncons :: [a] -> Maybe (a, [a])
    

    which checks if the list is of a form you could have constructed with : (i.e., if it is non-empty), and if yes, gives you back the head and tail.

    ++ is not injective though, for example

    Prelude> [0,1] ++ [2]
    [0,1,2]
    Prelude> [0] ++ [1,2]
    [0,1,2]
    

    Neither of these representations is the right one, so how should the list be deconstructed again?

    What you can do however is define a new, “virtual” constructor that acts like : in that it always seperates exactly one element from the rest of the list (if possible), but does so on the right:

    {-# LANGUAGE PatternSynonyms, ViewPatterns #-}
    
    pattern (:>) :: [a] -> a -> [a]
    pattern (xs:>ω) <- (unsnoc -> Just (xs,ω))
     where xs:>ω = xs ++ [ω]
    
    unsnoc :: [a] -> Maybe ([a], a)
    unsnoc [] = Nothing
    unsnoc [x] = Just x
    unsnoc (_:xs) = unsnoc xs
    

    Then

    sum' :: Num a => [a] -> a
    sum' (xs:>x) = x + sum xs
    sum' [] = 0
    

    Note that this is very inefficient though, because the :> pattern-synonym actually needs to dig through the entire list, so sum' has quadratic rather than linear complexity.

    A container that allows pattern matching on both the left and right end efficiently is Data.Sequence, with its :<| and :|> pattern synonyms.