Boost Spirit x3: parse delimited string


I'm writing yet another boring calculator parser with Spirit X3 and I've run into a problem: I've defined 2 literals, "cos" and "cosh", each of which expects to be followed by a number. The rules I've written are:

const std::string COS_TAG = "cos";
const std::string COSH_TAG = "cosh";
const auto cos = (COS_TAG > value)[math::cos_solver{}];
const auto cosh = (COSH_TAG > value)[math::cosh_solver{}];

(I know semantic actions aren't the preferred way, but I'm lazy). Now, the problem when parsing "cosh 3.5" is:

expectation failure: expecting value here "cosh 3.5"
----------------------------------------------^-----

Looks like the parser is eager and consumes the first tag without checking for the other. I've made it work by using the difference operator like this:

const std::string COS_TAG = "cos";
const std::string COSH_TAG = "cosh";
const auto cos = ((x3::lit(COS_TAG) - COSH_TAG) > value)[math::cos_solver{}];
const auto cosh = (COSH_TAG > value)[math::cosh_solver{}];

Is there a better approach?


Solution

  • So this isn't really about delimited strings; what you want is to check for token boundaries.

    The Qi repository had the distinct directive for this, and I think I may have glanced at something similar in the X3 code base once. Will look for it.

    Regardless, for now/older versions/your understanding, here are some ways to achieve it:

    • re-order the branches: if you try cosh before cos you get the behaviour you want, because alternatives are tried in order and the first one that matches wins

    • make a more general assertion about your identifier:

      auto kw = [](auto p) {
           return x3::lexeme [ x3::as_parser(p) >> !x3::char_("a-zA-Z0-9_") ];
      };
      

      Now, instead of lit(COSH), you can use kw(COSH) and be sure it won't match coshida.