
Spirit: Allowing a character at the beginning but not in the middle


I'm trying to write a parser for JavaScript identifiers. So far this is what I have:

// All of these rules have a string as their attribute.
identifier_ = identifier_start
    >> 
    *(
        identifier_part >> -(qi::char_(".") > identifier_part)                      
     )
;
identifier_part = +(qi::alnum | qi::char_("_"));
identifier_start = qi::char_("a-zA-Z$_");

This parser works fine for the list of "good identifiers" in my tests:

"x__",
"__xyz",
"_",
"$",
"foo4_.bar_3",
"$foo.bar",
"$foo",
"_foo_bar.foo",
"_foo____bar.foo"

but I'm having trouble with one of the bad identifiers: foo$bar. This is supposed to fail, but it succeeds! And the synthesized attribute has the value "foo".

Here is the debug output for foo$bar:

<identifier_>
    <try>foo$bar</try>
    <identifier_start>
        <try>foo$bar</try>
        <success>oo$bar</success>
        <attributes>[[f]]</attributes>
    </identifier_start>
    <identifier_part>
        <try>oo$bar</try>
        <success>$bar</success>
        <attributes>[[f, o, o]]</attributes>
    </identifier_part>
    <identifier_part>
        <try>$bar</try>
        <fail/>
    </identifier_part>
  <success>$bar</success>
  <attributes>[[f, o, o]]</attributes>
</identifier_>

What I want is for the parser to fail when parsing foo$bar but not when parsing $foobar.

What am I missing?


Solution

  • Nothing in your grammar requires the parser to consume all of the input.

    When the rule stops matching just before the $ sign, it returns success, because nothing says an identifier can't be followed by a $ sign. So you want to assert that the identifier is not followed by a character that could still belong to an identifier (here, anything identifier_start accepts):

    identifier_ = identifier_start
        >> 
        *(
            identifier_part >> -(qi::char_(".") > identifier_part)                      
         ) >> !identifier_start
    ;
    

    A related tool is the distinct directive from the Spirit repository (Qi components): http://www.boost.org/doc/libs/1_55_0/libs/spirit/repository/doc/html/spirit_repository/qi_components/directives/distinct.html