I am learning Haskell with Write Yourself a Scheme.
I am currently trying to implement char recognition for Scheme. A char is written #\<character> or #\<character-name>, like #\a, #\ or #\space.
So I wrote the following code:
-- .. some code ..
data LispVal = Atom String
             | List [LispVal]
             | DottedList [LispVal] LispVal
             | String String
             | Number Integer
             | Bool Bool
             | Char Char deriving Show
-- .... More code ...
parseChar :: Parser LispVal
parseChar = liftM Char (parseSingleChar <|> parseSpecialCharNotation)

parseSingleChar :: Parser Char
parseSingleChar = do string "#\\"
                     x <- letter
                     return x

parseSpecialCharNotation :: Parser Char
parseSpecialCharNotation = do string "#\\"
                              x <- (parseSpace <|> parseNewline)
                              return x

parseSpace :: Parser Char
parseSpace = do char 's'
                char 'p'
                char 'a'
                char 'c'
                char 'e'
                return ' '

parseNewline :: Parser Char
parseNewline = do char 'n'
                  char 'e'
                  char 'w'
                  char 'l'
                  char 'i'
                  char 'n'
                  char 'e'
                  return '\n'
-- .. some more code...
readExpr :: String -> String
readExpr input = case parse parseExpr "lisp" input of
    Left err -> "Parse Error: " ++ show err
    Right val -> "Found value: " ++ show val
At this point, I did not know about the string parser in Parsec.
The problem is that it recognizes #\a, but #\space is treated as an s.
*Main> readExpr "#\\space"
"Found value: Char 's'"
To resolve this problem, I changed parseChar to:
parseChar :: Parser LispVal
parseChar = liftM Char (parseSpecialCharNotation <|> parseSingleChar)
This solves the earlier problem, but now normal characters produce errors:
*Main> readExpr "#\\s"
"Parse Error: \"lisp\" (line 1, column 4):\nunexpected end of input\nexpecting \"p\""
Why is that happening? Shouldn't it have moved on to parseSingleChar when parseSpecialCharNotation failed?
Full code at: Gist
From the documentation for <|>:
The parser is called predictive since q is only tried when parser p didn't consume any input (i.e.. the look ahead is 1).
In your case both parsers start by consuming "#\\", so when the first alternative fails it has already consumed input and the other alternative is never tried. You can use try to ensure backtracking works as expected:
The parser try p behaves like parser p, except that it pretends that it hasn't consumed any input when an error occurs.
Something like this:
try parseSpecialCharNotation <|> parseSingleChar
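To make the difference visible, here is a minimal self-contained sketch. It assumes the parsec 3 module names (Text.Parsec and Text.Parsec.String); the names special, single, withoutTry and withTry are made up for this illustration and are not taken from your code.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical stand-in for parseSpecialCharNotation: prefix, then a name.
special :: Parser Char
special = string "#\\" >> ((' ' <$ string "space") <|> ('\n' <$ string "newline"))

-- Hypothetical stand-in for parseSingleChar: prefix, then one letter.
single :: Parser Char
single = string "#\\" >> letter

-- Without try: if special fails after consuming "#\", single is never tried.
withoutTry :: Parser Char
withoutTry = special <|> single

-- With try: a failure in special rewinds the input, so single gets its turn.
withTry :: Parser Char
withTry = try special <|> single
```

In GHCi, parse withoutTry "lisp" "#\\s" should fail in the same way as your second version, while parse withTry "lisp" "#\\s" should succeed with 's'.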
Side note: it is better to extract "#\\" out of the parsers, because otherwise you are doing the same work twice. With the string "#\\" line removed from both parseSpecialCharNotation and parseSingleChar, something like this:
do
  string "#\\"
  try parseSpecialCharNotation <|> parseSingleChar
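A fuller sketch of that refactoring, under the same assumption of parsec 3 module names. The names charName and charLiteral are hypothetical; in your code they would correspond to parseSpecialCharNotation (minus the prefix) and the body of parseChar before it is wrapped in Char.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Named characters only; the "#\" prefix is no longer parsed here.
charName :: Parser Char
charName = (' ' <$ string "space") <|> ('\n' <$ string "newline")

-- The prefix is parsed exactly once. try only has to rewind over a
-- partially matched name, after which a plain letter is attempted.
charLiteral :: Parser Char
charLiteral = do
  _ <- string "#\\"
  try charName <|> letter
```

The try is still needed here: on the input "#\\s", string "space" consumes the s before failing, and without try the letter alternative would be skipped.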
Also, you can use the string combinator instead of a series of char parsers.
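For example, parseSpace and parseNewline could be sketched as follows. This again assumes the parsec 3 module names and keeps your function names, so treat it as one possible rewrite rather than the only one.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- string matches the whole word; its result is discarded and the
-- character that the name stands for is returned instead.
parseSpace :: Parser Char
parseSpace = string "space" >> return ' '

parseNewline :: Parser Char
parseNewline = string "newline" >> return '\n'
```

Like the chain of char parsers, string consumes input when only a leading part of the word matches, so the advice about try above still applies.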