Search code examples
parsinghaskellhappy

Data type mismatch with Haskell parser generator - Happy


I have been dealing with this problem for a couple days and I'm out of ideas, hopefully you can help me:

My token list is the following:

%Token
var       {TokenVariableDeclaration}
varId     {TokenVar $$} -- Strings like "x", "n" or "m"
int       {TokenInt $$}

My grammar rule is as follows:

VariablesList : var varId ';'                   {VariablesList [($2,ArithmeticInt 0)]}
              | var varId ',' VariablesList     {VariablesList (($2,ArithmeticInt 0):$4)}

ArithmeticExpression : int {ArithmeticInt $1}

It just defines a list of variables like any you might find in an imperative programming language (in this given language, variables can only be assigned integers):

var n,m,x;

And my lexer (Haskell portion of the file) has the following data type:

data VariablesList = VariablesList [(String,ArithmeticExpression)] deriving (Show, Eq)
data ArithExpression = ArithInt Int deriving (Show, Eq)

So that, after parsing, I can get a list of all the variables declared, initizalized with the data "ArithmeticInt 0":

VariablesList [("n",ArithmeticInt 0),("m",ArithmeticInt 0),("x",ArithmeticInt 0)]

When I run the 'happy' command on my prompt everything is fine:

C:> happy "myParser.y"

But when I load the resulting .hs file on my GHCI:

Prelude> :l "myParser.hs"

I get an extensive error saying that the type VariablesList cannot be matched to the type [(String,ArithmeticExpression)]. I know, due to different tests I've made, that the issue is on the second pattern of my VariablesList grammar rule:

VariablesList : var varId ';'                   {VariablesList [($2,ArithmeticInt 0)]}
              | var varId ',' VariablesList     {VariablesList (($2,ArithmeticInt 0):$4)}

Sprecifically the ($2,ArithmeticInt 0):$4 portion. I'm pretty new to Haskell and what I can understand is that the fourth argument ($4) is of type VariablesList and a type (String,ArithmeticExpression) cannot be concatenated (:) to it.

Any kind of help or guidance will be very much welcomed :) .

EDIT: By petition, here is a minimal working Happy file:

{  
module HappyLambdaSyntax4 where  
import Data.Char
import System.IO
}

%name parse VariablesList
%tokentype {Token}
%error {parseError}

%token
var       {TokenVariableDeclaration}
varId     {TokenVar $$} -- Strings like "x", "n" or "m"
int       {TokenInt $$}
';'       {TokenPuntoYComa}
','       {TokenComa}

%%

VariablesList : var varId ';'                         {VariablesList [($2,ArithmeticInt 0)]} -- var n;
              | var varId ',' varId ';'               {VariablesList (($2,ArithmeticInt 0):[($4,ArithmeticInt 0)])} --var n,m;
              | var varId ',' varId ',' varId ';'     {VariablesList (($2,ArithmeticInt 0):[($4,ArithmeticInt 0),($6,ArithmeticInt 0)])} --var n,m,x;
              -- var varId ',' VariablesList          {VariablesList (($2,ArithmeticInt):$4)} Ideal solution. Recursive. Does not work.

ArithmeticExpression : int {ArithmeticInt $1}

{
parseError :: [Token] -> a
parseError _ = error ("Parse error.")

data ArithmeticExpression = ArithmeticInt Int deriving (Show, Eq)
data VariablesList = VariablesList [(String,ArithmeticExpression)] deriving (Show, Eq)

data Token = TokenVariableDeclaration
           | TokenVar String
           | TokenInt Int
           | TokenPuntoYComa
           | TokenComa
           deriving (Show, Eq)

lexer :: String -> [Token]
lexer [] = []
lexer (c:cs)
    | isSpace c = lexer cs
    | isDigit c = lexNum (c:cs)
    | isAlpha c = lexVar (c:cs)
    | c == ';' = TokenPuntoYComa : (lexer cs)
    | c == ',' = TokenComa : (lexer cs)
    | otherwise = error ("Lexer error.")

lexNum cs = TokenInt (read num) : lexer rest
    where (num,rest) = span isDigit cs

lexVar cs =
    case span isAlpha cs of
    ("var",rest) -> TokenVariableDeclaration : lexer rest
    (var,rest) -> TokenVar var : lexer rest
}

Run with:

>happy "file.y"

Then, in GHCI, load:

Prelude> :l file.hs

Finally, to test it:

Prelude> parse (lexer "var n,m,x;")

Or any list with less than 3 variables.


Solution

  • First of all: when you launch happy it generates an Haskell file, but doesn't compile it. So happy does not check whether the haskell code you inserted is valid. That's done afterwards when you compile the file. The behaviour you see is expected.

    Now the problem is that your rule is:

    var varId ',' VariablesList     {VariablesList (($2,ArithmeticInt 0):$4)}
    

    Where $4 refers to a VariablesList but : has type a -> [a] -> [a] not (String, ArithmeticExpression) -> VariablesList -> VariablesList. The $4 does not refer to the list contained inside the VariablesList.

    What you need is a way to concatenate VariablesLists, for example:

    x <:> (VariablesList xs) = VariablesList (x:xs)
    

    and use the rule:

    var varId ',' VariablesList     {($2,ArithmeticInt 0) <:> $4}
    

    A suggestion: happy allows you to define parametrized productions. Usually lists are better handled with such rules.

    For example you could define a production that represent a list with a separator:

    rev_list_sep(p, sep) : p                           {[$1]}
                         | rev_list_sep(p, sep) sep p  {$3 : $1}
    

    and use it as:

    VarDecl : var varId
    VariablesList : rev_list_sep(VarDecl, ',') ';'   {VariablesList (reverse $1)}
    

    (not tested, just to give an idea).

    Note that you can reuse such a production to define other lists:

    SomeOtherList : rev_list_sep(SomethingElse, ';')  {Whatever (reverse $1)}