
Type inference fails when defining lexer using makeTokenParser


I have the following code:

module Lexer (lexer) where
import Text.Parsec
import Text.Parsec.Token
import Text.Parsec.Language

opChars = "<>+=*-/!:"

def = emptyDef {
    commentStart = "/*"
  , commentEnd = "*/"
  , identStart = letter
  , identLetter = alphaNum
  , opStart = oneOf opChars
  , opLetter = oneOf opChars
  , reservedOpNames = ["+", "-", "*", "!=", "==", ":=", "<", "<=", ">", ">="]
  , reservedNames = ["and", "not", "or", "p_", "when", "output", "of"]
  }

lexer = makeTokenParser def

However, when I try to import this file in ghci, I get the following error:

Prelude> :l Lexer.hs
[1 of 1] Compiling Lexer            ( Lexer.hs, interpreted )

Lexer.hs:11:18:
    No instance for (Stream s0 m0 Char) arising from a use of `letter'
    The type variables `s0', `m0' are ambiguous
    Relevant bindings include
      def :: GenLanguageDef s0 u m0 (bound at Lexer.hs:8:1)
    Note: there are several potential instances:
      instance Monad m =>
               Stream Data.ByteString.Internal.ByteString m Char
        -- Defined in `Text.Parsec.Prim'
      instance Monad m =>
               Stream Data.ByteString.Lazy.Internal.ByteString m Char
        -- Defined in `Text.Parsec.Prim'
      instance Monad m => Stream Data.Text.Internal.Lazy.Text m Char
        -- Defined in `Text.Parsec.Prim'
      ...plus two others
    In the `identStart' field of a record
    In the expression:
      emptyDef
        {commentStart = "/*", commentEnd = "*/", identStart = letter,
         identLetter = alphaNum, opStart = oneOf opChars,
         opLetter = oneOf opChars, reservedOpNames = ["+", "-", "*", ....],
         reservedNames = ["and", "not", "or", ....]}
    In an equation for `def':
        def
          = emptyDef
              {commentStart = "/*", commentEnd = "*/", identStart = letter,
               identLetter = alphaNum, opStart = oneOf opChars,
               opLetter = oneOf opChars, reservedOpNames = ["+", "-", ....],
               reservedNames = ["and", "not", ....]}

Note that this only happened after splitting apart the file; I previously had some code that consumed "lexer" in the same module, and then separated it out.

What type annotations do I need to provide in order for this to work?


Solution

  • The solution is to annotate: def :: LanguageDef st. That will fix s0 to String and m0 to Identity.
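
    A minimal sketch of the annotated module, assuming the rest of the file stays as in the question. The signatures on opChars and lexer are optional extras I added for readability; only the one on def is needed. TokenParser st is the Text.Parsec.Token synonym for GenTokenParser String st Identity.

    ```haskell
    module Lexer (lexer) where

    import Text.Parsec
    import Text.Parsec.Token
    import Text.Parsec.Language

    opChars :: String
    opChars = "<>+=*-/!:"

    -- LanguageDef st = GenLanguageDef String st Identity, so this signature
    -- pins the stream type to String and the monad to Identity.
    -- The user-state type st stays polymorphic.
    def :: LanguageDef st
    def = emptyDef {
        commentStart = "/*"
      , commentEnd = "*/"
      , identStart = letter
      , identLetter = alphaNum
      , opStart = oneOf opChars
      , opLetter = oneOf opChars
      , reservedOpNames = ["+", "-", "*", "!=", "==", ":=", "<", "<=", ">", ">="]
      , reservedNames = ["and", "not", "or", "p_", "when", "output", "of"]
      }

    -- Optional: follows from the type of def, written out for documentation.
    lexer :: TokenParser st
    lexer = makeTokenParser def
    ```

    With the signature in place, :l Lexer.hs should load without the ambiguity error, and importing modules can use fields such as identifier lexer on String input.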

    Here's a... loose explanation.

    Let's pick through the types from the outside in. (That's not necessarily how inference works, but we usually have or can get the types of top-level bindings.) The inferred type for def is noted in the error message: GenLanguageDef s0 u m0. Looking at the definition of GenLanguageDef, the inferred type of identStart is therefore ParsecT s0 u m0 Char.

    letter has type Stream s m Char => ParsecT s u m Char. Unifying that with the type of identStart, we get the constraint Stream s0 m0 Char that needs to be satisfied somehow.

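    The monomorphism restriction prohibits the compiler from generalizing over the inferred constraint and floating it out to the type of def, so the type variables have to be resolved somewhere inside the module. In your previous one-file version, the code that consumed lexer did exactly that; once it moved out, nothing in Lexer was left to determine s0 and m0, and they became ambiguous. With the restriction disabled, def instead has the inferred type Stream s0 m0 Char => GenLanguageDef s0 u m0, and each consumer may fix the type variables at the use site.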
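
    For comparison, here is a sketch of the route where the restriction is switched off with a pragma and no signatures are written. The types in the comments are what I'd expect GHC to infer based on the reasoning above, not copied compiler output.

    ```haskell
    {-# LANGUAGE NoMonomorphismRestriction #-}
    module Lexer (lexer) where

    import Text.Parsec
    import Text.Parsec.Token
    import Text.Parsec.Language

    opChars = "<>+=*-/!:"

    -- Expected inferred type, with the constraint floated out:
    --   def :: Stream s m Char => GenLanguageDef s u m
    def = emptyDef {
        commentStart = "/*"
      , commentEnd = "*/"
      , identStart = letter
      , identLetter = alphaNum
      , opStart = oneOf opChars
      , opLetter = oneOf opChars
      , reservedOpNames = ["+", "-", "*", "!=", "==", ":=", "<", "<=", ">", ">="]
      , reservedNames = ["and", "not", "or", "p_", "when", "output", "of"]
      }

    -- Expected inferred type:
    --   lexer :: Stream s m Char => GenTokenParser s u m
    lexer = makeTokenParser def
    ```

    An importing module then chooses the stream and monad through how it uses lexer. For example, parse (identifier lexer) "" "x1" selects String from the input and Identity from parse. The trade-off is that the lexer stays generic, while the explicit signature gives a simpler, fixed type.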

    Alternatively, providing the concrete signature I suggested fixes the variables s0 and m0 directly. Now the compiler can satisfy the class constraint, since it knows Identity is a Monad and that there's an instance Monad m => Stream [tok] m tok, which covers Stream String m Char.

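    (You would think that, because emptyDef has type LanguageDef st, which translates to GenLanguageDef String st Identity, def would have that type. Ah, but you're using record update syntax, which allows the type variables to change when every field that mentions them is updated. That is the case here: identStart, identLetter, opStart and opLetter are all replaced.)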
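
    The record update behaviour can be seen in isolation with a toy record. Box, label and payload are hypothetical names made up for this illustration; they are not part of Parsec.

    ```haskell
    -- Hypothetical record for illustration only.
    data Box a = Box { label :: String, payload :: a }

    intBox :: Box Int
    intBox = Box { label = "n", payload = 1 }

    -- payload is the only field mentioning a, so updating it
    -- may change the type of the whole record: Box Int becomes Box String.
    strBox :: Box String
    strBox = intBox { payload = "one" }

    -- label does not mention a, so this update keeps the type Box Int.
    renamed :: Box Int
    renamed = intBox { label = "m" }
    ```

    Without the signature on strBox, the new type would simply be inferred from the updated field, which is the same mechanism that leaves s0 and m0 open in def.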