I have this parser:
class Parser
  %%{
    machine test_lexer;
    action s { s = p; puts "s#{p}" }
    action e { e = p; puts "e#{p}" }
    action captured {
      puts "captured #{s} #{e}"
    }
    key_value = "a" %s ("b" | "x" "c")+ %e %captured;
    tags = ("x"+)? key_value;
    main := tags*;
  }%%
  def initialize(data)
    data = data
    eof = data.length
    %% write data;
    %% write init;
    %% write exec;
  end
end
Parser.new(ARGV.first)
When I run it with the input abxc, why does it call captured (and e) twice, and how can I prevent this?
ragel -R simple.rl && ruby simple.rb "abxc"
s1
e2
captured 1 2
e4
captured 1 4
On GitHub: https://github.com/grosser/ragel_example
Here is the diagram for your machine, BTW: http://bit.do/stackoverflow-19621544 (created with Erdos).
With "abxc" the ("b" | "x" "c")+ machine first matches the "b" and then the "xc". When transitioning from "b" (to "x") it calls the leaving actions (e and captured) for the first time, and when transitioning from "xc" (to EOF) it calls the leaving actions (e and captured) for the second time.
I guess the e action is supposed to set the end pointer in order to capture the string between start s and end e. If so, then Ragel calling the e action multiple times isn't really a problem: each call just advances the end pointer, as your action already does.
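To illustrate why the repeated e calls are harmless, here is a minimal plain-Ruby sketch that replays the action calls from your output (s1, e2, e4) without involving Ragel at all. The helper name last_capture and the [action, p] event list are made up for this illustration; they are not part of Ragel or of your parser.

```ruby
# Hypothetical helper, not part of Ragel: replays a list of [action, p]
# pairs and returns the text between the last start and the last end.
def last_capture(data, events)
  s = e = nil
  events.each do |action, p|
    case action
    when :s then s = p   # start pointer
    when :e then e = p   # end pointer; each later call overwrites the earlier one
    end
  end
  data[s...e]
end

# The s1, e2, e4 calls from the output in the question
puts last_capture("abxc", [[:s, 1], [:e, 2], [:e, 4]])  # prints "bxc"
```

The capture is read only once, after all the calls have been replayed, so the intermediate e2 has no effect on the result. Whether reading s and e once at the end fits your real grammar depends on how many key_value matches you expect per input, which this sketch does not model.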