I have the following rule:
rule<std::string::const_iterator, std::string()> t_ffind, t_sim, t_hash, t_state;
t_ffind = hold[(attr('$') >> t_sim >> t_hash >> t_state)] | t_sim;
This means that I can find t_sim alone, or followed by t_hash and t_state. If it is alone, t_ffind takes the exact value of t_sim; otherwise I also insert a marker character at the beginning of the string.
But if I write the rule like that, t_sim gets parsed twice, so I changed the rule to:
t_ffind = t_sim >> -(qi::hold[t_hash >> t_state]);
The problem of inserting the character when (t_hash >> t_state) is present remains, though. I think the solution could be a semantic action at the end:
t_ffind = t_sim >> -(qi::hold[t_hash >> t_state])[];
But I can't work out how to do that. A solution that doesn't involve a semantic action would be even better.
I'd say the idea of "adding a magic character to some unrelated attribute" constitutes a questionable design choice. In general, I recommend keeping parsing and program logic separate. So I'd parse into
namespace ast {
    struct t_ffind {
        std::string t_sim;
        boost::optional<std::string> t_hash, t_state; // or whatever the types are
    };
}
Or, if you really don't have a reason to model the hash/state tokens as separate fields, you could do
namespace ast {
    struct t_ffind {
        std::string t_sim_hash_state;
        bool sim_only;
    };
}
However, setting sim_only from within a semantic action gets more complicated, which is close to the issue you are facing.
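To make the "keep parsing and program logic separate" point concrete, here is a minimal sketch of adding the marker after parsing instead of inside the grammar. It assumes the first AST above, with std::optional (C++17) standing in for boost::optional so that it compiles without Boost. The helper name to_marked_string is hypothetical and not part of Spirit.

```cpp
#include <optional>
#include <string>

namespace ast {
    // Mirrors the first AST above; std::optional is an assumption made
    // here so the sketch has no Boost dependency.
    struct t_ffind {
        std::string t_sim;
        std::optional<std::string> t_hash, t_state;
    };
}

// Hypothetical helper: derives the '$'-marked string once parsing is done.
// The marker is added only when both optional parts are present.
std::string to_marked_string(ast::t_ffind const& node) {
    if (node.t_hash && node.t_state)
        return '$' + node.t_sim + *node.t_hash + *node.t_state;
    return node.t_sim;
}
```

With this split, the grammar only has to fill in the fields, and the decision about the marker lives in ordinary code that can be tested without a parser.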
Just for fun, let's see what we could do. Firstly, optimizing away the repeated parsing of t_sim smells like premature optimization. But perhaps you could use a semantic action to alter _val:
t_ffind %= t_sim >> -(as_string[t_hash >> t_state] [ insert(_val, begin(_val), '$') ]);
Note the use of as_string[] to glue the attributes of t_hash and t_state together, so that automatic attribute propagation keeps working. I strongly suspect this is a bigger performance hit than potentially parsing t_sim twice.
You can try to wrangle more control from Spirit:
t_ffind = (t_sim >> -(as_string[t_hash >> t_state]))
    [ if_(_2) [ _val = '$' + _1 + *_2 ].else_ [ _val = _1 ] ];
This still uses the intermediate as_string concatenation. You can forgo it:
t_ffind = (t_sim >> -(t_hash >> t_state))
    [ if_(_2)
        [ _val = '$' + _1 + at_c<0>(*_2) + at_c<1>(*_2) ]
      .else_
        [ _val = _1 ]
    ];
By now, we're getting ridiculously far adrift for very little gain (if any). I'd suggest one of the following:
- writing it the naive way:
  t_ffind = hold[(attr('$') >> t_sim >> t_hash >> t_state)] | t_sim;
- fixing your AST to mirror the thing you're parsing
- writing the parser manually
All the above variations:
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <boost/spirit/include/phoenix_fusion.hpp>
#include <functional>
#include <iostream>
#include <string>
#include <vector>

int main() {
    using namespace boost::spirit::qi;

    // Toy sub-rules, just enough to exercise t_ffind
    rule<std::string::const_iterator, std::string()>
        t_sim   = "sim",
        t_hash  = +digit,
        t_state = raw[lit("on")|"off"],
        t_ffind;

    // Each lambda installs one variation of t_ffind
    for (auto initialize_t_ffind : std::vector<std::function<void()> > {
            [&] { t_ffind = hold[(attr('$') >> t_sim >> t_hash >> t_state)] | t_sim; },
            [&] {
                // this works:
                using boost::phoenix::insert;
                using boost::phoenix::begin;
                t_ffind %= t_sim >> -(as_string[t_hash >> t_state] [ insert(_val, begin(_val), '$') ]);
            },
            [&] {
                // this works too:
                using boost::phoenix::if_;
                t_ffind = (t_sim >> -(as_string[t_hash >> t_state]))
                    [ if_(_2)
                        [ _val = '$' + _1 + *_2 ]
                      .else_
                        [ _val = _1 ]
                    ];
            },
            [&] {
                // "total control":
                using boost::phoenix::if_;
                using boost::phoenix::at_c;
                t_ffind = (t_sim >> -(t_hash >> t_state))
                    [ if_(_2)
                        [ _val = '$' + _1 + at_c<0>(*_2) + at_c<1>(*_2) ]
                      .else_
                        [ _val = _1 ]
                    ];
            } })
    {
        initialize_t_ffind();

        for (std::string const s : { "sim78off", "sim" })
        {
            auto f = s.begin(), l = s.end();
            std::string result;
            if (parse(f, l, t_ffind, result)) {
                std::cout << "Parsed: '" << result << "'\n";
            } else {
                std::cout << "Parse failed\n";
            }

            if (f != l) {
                std::cout << "Remaining input: '" << std::string(f,l) << "'\n";
            }
        }
    }
}
Prints:
Parsed: '$sim78off'
Parsed: 'sim'
Parsed: '$sim78off'
Parsed: 'sim'
Parsed: '$sim78off'
Parsed: 'sim'
Parsed: '$sim78off'
Parsed: 'sim'
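For comparison, the "writing the parser manually" option could look roughly like the sketch below. It is only an illustration for the toy sub-rules in the program above (t_sim = "sim", t_hash = one or more digits, t_state = "on" or "off"), not a general replacement for Spirit. The function name parse_ffind and its position-based interface are assumptions made for this example, and it uses std::optional (C++17).

```cpp
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>

// Hand-written counterpart of t_ffind for the toy grammar.
// Returns std::nullopt if the input at `pos` does not start with "sim";
// otherwise returns the attribute and advances `pos` past the consumed input.
std::optional<std::string> parse_ffind(std::string const& in, std::size_t& pos) {
    // True if `word` occurs in `in` at offset `at` (at <= in.size()).
    auto matches = [&in](std::size_t at, std::string const& word) {
        return in.compare(at, word.size(), word) == 0;
    };

    std::size_t const start = pos;
    if (start > in.size() || !matches(start, "sim"))
        return std::nullopt;
    pos = start + 3;  // t_sim is consumed exactly once

    // Optional tail: t_hash (digits) followed by t_state ("on" or "off").
    std::size_t p = pos;
    while (p < in.size() && std::isdigit(static_cast<unsigned char>(in[p])))
        ++p;
    if (p > pos) {
        for (std::string const state : {"on", "off"}) {
            if (matches(p, state)) {
                p += state.size();
                std::string result = '$' + in.substr(start, p - start);
                pos = p;
                return result;
            }
        }
    }
    // Tail absent or incomplete: keep only t_sim, leaving the rest unconsumed.
    return in.substr(start, 3);
}
```

Here t_sim is matched once and the marker is prepended only when the full tail is present, which is the behaviour the semantic-action variants were trying to achieve. An incomplete tail such as "sim78" is treated like Spirit's backtracking would treat it: only "sim" is consumed.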