For the following Treetop grammar, when parsing '3/14/01' (via t = Parser.parse('3/14/01') in irb), I get a "TypeError: wrong argument type Class (expected Module)".
grammar SimpleDate
  rule dateMDY
    whitespace? month_part ( '/' / '-') day_part ( ('/' / '-') year_part)? whitespace? <DateMDY>
  end

  rule month_part
    ( ( '1' [0-2] ) / ( '0'? [1-9] ) ) <MonthLiteral>
  end

  rule day_part
    ( ( [12] [0-9] ) / ( '3' [0-1] ) / ( '0'? [1-9] ) ) <DayLiteral>
  end

  rule year_part
    ( ( '1' '9' ) / ( '2' [01] ) )? [0-9] [0-9] <YearLiteral> # 1900 through 2199 (4 digit)
  end

  rule whitespace
    [\s]+
  end
end
First, if I comment out the <MonthLiteral> and <DayLiteral> class references, all is well. Commenting out <DateMDY> but leaving those Literal classes in still produces the error. Commenting out <YearLiteral> does not seem to matter (it works or fails regardless). That seems to indicate that because the first two are non-terminal, I can't produce elements for them.
There is clearly something I'm not appreciating about how Ruby (or Treetop) instantiates these classes, or about AST generation, that would explain what happens. Can you explain, or point me to something that can help me understand, why <MonthLiteral> or <DayLiteral> can't have objects generated?
Second, this may be a bridge too far, but what I'd really prefer would be to get a DateMDY object with three attributes -- month, day, and year -- so I can readily produce a Ruby Time object from a to_time method in DateMDY, but right now I'd settle for just producing the constituent pieces as objects.
So I tried leaving <DateMDY> as the object and commented out the references to <MonthLiteral>, <DayLiteral>, and <YearLiteral>. I saw that the AST object returned from .parse (t in my original example) has two public methods, :day_part and :month_part, but both seem to be nil when I invoke them (say, puts t.day_part), and there is no :year_part method, so that didn't help me.
Is it possible to somehow have DateMDY end up accessing its constituent parts?
FYI, the Parser code I'm using is pretty standard stuff from the Treetop tutorials, and the node_extensions.rb that defines the object classes is also trivial, but I can post those too if you need to see them.
Thanks! Richard
The error message is telling you exactly what you did wrong. There's only a restricted set of places where you can use a Class this way, and when it's allowed, the Class must be a subclass of SyntaxNode. Normally, however, you should use a Module, which is extend()ed into the SyntaxNode that has been created by an inner rule. The difference in the case of YearLiteral is that it does not wrap a parenthesised sequence the way MonthLiteral and DayLiteral do. A parenthesised sequence returns an existing SyntaxNode, which cannot be extend()ed with a Class, only with a Module, so you get the TypeError.
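The TypeError itself comes from plain Ruby rather than from anything Treetop-specific: Object#extend only accepts a Module. The sketch below reproduces it without Treetop. A plain Object stands in for the already-created SyntaxNode, and the names MonthLiteralAsClass, MonthLiteralAsModule and extend_error_for are hypothetical, chosen only for this illustration.

```ruby
# Hypothetical stand-ins: these are not the asker's node_extensions.rb classes.
class MonthLiteralAsClass
end

module MonthLiteralAsModule
  def month
    :month_from_module
  end
end

# Tries to extend an existing object and returns the TypeError message,
# or nil if the extension succeeded.
def extend_error_for(node, extension)
  node.extend(extension)
  nil
rescue TypeError => e
  e.message
end

node = Object.new # plays the role of an existing SyntaxNode

puts extend_error_for(node, MonthLiteralAsClass).inspect  # a TypeError message
puts extend_error_for(node, MonthLiteralAsModule).inspect # nil, the extend worked
puts node.month.inspect                                    # method mixed in by the module
```

Under this reading, turning MonthLiteral and DayLiteral into modules (or attaching the class to an unparenthesised sequence, as year_part does) would avoid the error.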
As for your second question, the DateMDY object you want should almost certainly not be a SyntaxNode, since every SyntaxNode retains references to all of its child SyntaxNodes and to the input string. This is the parser internals we're talking about. Do you really want to expose bits of the parser internals to the outside world?
Instead, you should arrange for the appropriate syntax node to be visited after the parse has completed, by calling a function that returns your domain object type constructed using the substrings identified and saved by these parser objects. It's best to add these functions to traverse down from your topmost rule, rather than trying to traverse the parse tree "from the outside".
You can do this by adding a block into your top rule, like this (assuming you have an appropriate DateMDY class). When you have a successful parse tree, get your DateMDY by calling "tree.result":
rule dateMDY
  whitespace? month_part ( '/' / '-') day_part y:( ('/' / '-') year_part)? whitespace?
  {
    def result
      DateMDY.new(y.empty? ? nil : y.year_part.text_value.to_i,
                  month_part.text_value.to_i,
                  day_part.text_value.to_i)
    end
  }
end
Of course, it's cleaner to add separate result methods for year_part, month_part and day_part; this is just an intro to how to add these methods.
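The result method above assumes "an appropriate DateMDY class" exists. A minimal sketch of one possible shape is shown here as a plain Ruby object with no ties to the parse tree. The argument order (year, month, day) follows the DateMDY.new call above. The default_year fallback for a missing year is an assumption made for illustration, and the sketch does not expand two-digit years (for '3/14/01' the year text "01" becomes 1 after to_i), so that policy would still need to be decided.

```ruby
# Hypothetical domain class: holds only plain integers, no SyntaxNode references.
class DateMDY
  attr_reader :year, :month, :day

  def initialize(year, month, day)
    @year = year   # may be nil when the input had no year part
    @month = month
    @day = day
  end

  # Assumption: when the year is missing, fall back to default_year
  # (the current year unless the caller supplies another one).
  def to_time(default_year = Time.now.year)
    Time.new(year || default_year, month, day)
  end
end

date = DateMDY.new(2001, 3, 14)
puts date.to_time.strftime("%Y-%m-%d")
```

With a class like this, tree.result hands back an object that can be kept around after the parser and its tree have been discarded.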