I am using Scala's parser combinator framework, extending the RegexParsers class. I have an identifier token which starts with a letter and can contain alphabet characters, dashes, underscores and digits, as long as it is not one of the reserved words. I tried to use the parser's not() function to stop reserved words from being used, but it also rejects identifiers that merely start with a reserved word.
def reserved = "and" | "or"
def identifier: Parser[String] = not(reserved) ~> """[a-zA-Z][\.a-zA-Z0-9_-]*""".r
However, when I try to parse an identifier like and-today, I get an error saying Expected Failure.
How do I only filter reserved words if they are a full match of the token and not just a prefix?
Also, is there a way to improve the error reporting when using not()? In other cases I get the regular expression that the parser is expecting, but in this case it just says Failure without any details.
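You can use filterWithError (a method on ParseResult) both to filter out the reserved words and to customize the error message. Because the regex consumes the whole token before the check runs, and-today is accepted while and on its own is rejected: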
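import scala.collection.immutable.HashSet

// Define these inside your RegexParsers subclass.
val reservedWords = HashSet("and", "or")
val idRegex = """[a-zA-Z][\.a-zA-Z0-9_-]*""".r
val identifier = Parser(input =>
  // Applying idRegex to the input relies on RegexParsers' implicit Regex-to-Parser conversion.
  idRegex(input).filterWithError(
    !reservedWords.contains(_),
    reservedWord => s"YOUR ERROR MESSAGE FOR $reservedWord",
    input
  )
)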
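
For reference, here is a minimal self-contained sketch of the same idea that can be compiled on its own, assuming the scala-parser-combinators library is on the classpath. The object name IdentifierParser, the helper parseIdentifier, and the wording of the error message are made up for illustration and are not part of the question's code.

```scala
import scala.util.parsing.combinator.RegexParsers

// Hypothetical wrapper object; the question does not name its parser class.
object IdentifierParser extends RegexParsers {
  private val reservedWords = Set("and", "or")

  // The type annotation applies the implicit Regex => Parser[String] conversion up front.
  private val idToken: Parser[String] = """[a-zA-Z][\.a-zA-Z0-9_-]*""".r

  // Match the whole token first, then reject it only if it is exactly a reserved word.
  val identifier: Parser[String] = Parser { input =>
    idToken(input).filterWithError(
      word => !reservedWords.contains(word),
      word => s"'$word' is a reserved word and cannot be used as an identifier",
      input
    )
  }

  // Hypothetical convenience method: parse a complete string as one identifier.
  def parseIdentifier(s: String): ParseResult[String] = parseAll(identifier, s)
}
```

With this sketch, IdentifierParser.parseIdentifier("and-today") should succeed, while IdentifierParser.parseIdentifier("and") should return a Failure carrying the custom message instead of the bare message produced by not().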