I wrote a simple program using Boost.Spirit Qi:
#include <boost/spirit/include/qi.hpp>
#include <string>
using namespace std;
using namespace boost::spirit;
int main() {
string str = "string";
auto begin = str.begin();
auto symbols = (qi::lit(";") | qi::lit("(") | qi::lit(")") | qi::lit("+") |
qi::lit("/") | qi::lit("-") | qi::lit("*"));
qi::parse(begin, str.end(), *(qi::char_ - symbols));
}
This program was terminated by a SEGV. Then my rewritten code, with fewer alternative operators on the right-hand side of symbols,
#include <boost/spirit/include/qi.hpp>
#include <string>
using namespace std;
using namespace boost::spirit;
int main()
{
string str = "string";
auto begin = str.begin();
auto symbols = (qi::lit(";") | qi::lit("+") | qi::lit("/") | qi::lit("-") |
qi::lit("*"));
qi::parse(begin, str.end(), *(qi::char_ - symbols));
}
now works well. What's the difference between the two cases?
Your problem is a classic mistake: using auto to store Qi parser expressions (see: Assigning parsers to auto variables).
That leads to UB. Both of your versions have the same bug; the shorter one merely appears to work.
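To make the dangling visible, here is a toy model. It is not Spirit's or Proto's actual implementation: Lit, AltRef, AltVal and deep_copy are hypothetical names I made up for illustration. The point is only that an expression node holding its operands by reference must be copied by value before the temporaries it refers to are destroyed.

```cpp
#include <string>

// Toy stand-in for a literal parser (hypothetical, not a Spirit type).
struct Lit {
    char c;
    bool match(char x) const { return x == c; }
};

// Mimics an un-copied expression node: it only REFERS to its operands.
template <class L, class R>
struct AltRef {
    L const& lhs;
    R const& rhs;
    bool match(char x) const { return lhs.match(x) || rhs.match(x); }
};

// Mimics a deep-copied node: it OWNS its operands.
template <class L, class R>
struct AltVal {
    L lhs;
    R rhs;
    bool match(char x) const { return lhs.match(x) || rhs.match(x); }
};

// Building the alternative only binds references to the two operands.
inline AltRef<Lit, Lit> operator|(Lit const& a, Lit const& b) {
    return {a, b};
}

// Hypothetical helper playing the role of a deep copy: the operands are
// copied while the temporaries of the full expression are still alive.
template <class L, class R>
AltVal<L, R> deep_copy(AltRef<L, R> const& e) {
    return {e.lhs, e.rhs};
}
```

With these definitions, auto good = deep_copy(Lit{';'} | Lit{'+'}); is safe, whereas auto bad = Lit{';'} | Lit{'+'}; leaves bad holding references to temporaries that are gone at the end of that statement. Using bad afterwards is undefined behaviour, so it may crash, or seem to work.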
Use a rule, or qi::copy (which is proto::deep_copy under the hood):
auto symbols = qi::copy(qi::lit(";") | qi::lit("(") | qi::lit(")") | qi::lit("+") |
qi::lit("/") | qi::lit("-") | qi::lit("*"));
Even better, use a character set to match all the characters at once,
auto symbols = qi::copy(qi::omit[qi::char_(";()+/*-")]);
The omit[] counteracts the fact that char_ exposes its attribute (whereas lit doesn't). But since all you ever use it for is to SUBTRACT it from another character set:
qi::char_ - symbols
You could just as well just write
qi::char_ - qi::char_(";()+/*-")
Now, you might not know this, but you can use ~charset to negate a character set, so it would just become
~qi::char_(";()+/*-")
NOTE: - can have special meaning in charsets (it denotes a range), which is why I very subtly moved it to the end. See the docs.
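To illustrate why the position of - matters, here is a simplified model of range syntax in a set definition. It is not Spirit's code: expand_charset is a hypothetical helper, and it only handles the simple a-z style of range, treating a - at the very start or end as a literal.

```cpp
#include <cstddef>
#include <set>
#include <string>

// Simplified illustration of range syntax in a character-set definition.
// expand_charset is a hypothetical helper, not part of Spirit.
std::set<char> expand_charset(std::string const& def) {
    std::set<char> out;
    for (std::size_t i = 0; i < def.size(); ++i) {
        // "x-y" is a range only if there is a character on both sides of '-'.
        bool is_range = i + 2 < def.size() && def[i + 1] == '-';
        if (is_range) {
            // Reversed ranges (x > y) contribute nothing in this sketch.
            for (int c = def[i]; c <= def[i + 2]; ++c)
                out.insert(static_cast<char>(c));
            i += 2; // skip the '-' and the upper bound
        } else {
            out.insert(def[i]); // plain character, including a trailing '-'
        }
    }
    return out;
}
```

In this model ";()+/*-" yields exactly the seven characters written, while "+-/" is read as the range from + to / and so also picks up , and . along the way.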
Extending a little and showing some subtler patterns:
#include <boost/spirit/include/qi.hpp>
#include <iomanip>
#include <iostream>
#include <string>
#include <vector>
using namespace std;
using namespace boost::spirit;
int main() {
string const str = "string;some(thing) else + http://me@host:*/path-element.php";
auto cs = ";()+/*-";
using qi::char_;
{
std::vector<std::string> tokens;
qi::parse(str.begin(), str.end(), +~char_(cs) % +char_(cs), tokens);
std::cout << "Condensing: ";
for (auto& tok : tokens) {
std::cout << " " << std::quoted(tok);
}
std::cout << std::endl;
}
{
std::vector<std::string> tokens;
qi::parse(str.begin(), str.end(), *~char_(cs) % char_(cs), tokens);
std::cout << "Not condensing: ";
for (auto& tok : tokens) {
std::cout << " " << std::quoted(tok);
}
std::cout << std::endl;
}
}
Prints
Condensing: "string" "some" "thing" " else " " http:" "me@host:" "path" "element.php"
Not condensing: "string" "some" "thing" " else " " http:" "" "me@host:" "" "path" "element.php"
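If the difference between the two patterns isn't obvious, here is the same idea hand-rolled without Spirit. This is a plain standard-library approximation, not what Qi does internally, and split is a hypothetical helper name. It also differs from the grammars at the edges: for instance, the condensing grammar fails on input that starts with a delimiter, while this loop just skips it.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: splits s at any character found in delims.
// condense == true  : a run of delimiters counts as one separator, and
//                     empty tokens are dropped (like +~cs % +cs).
// condense == false : every single delimiter separates, so empty tokens
//                     appear between adjacent delimiters (like *~cs % cs).
std::vector<std::string> split(std::string const& s, std::string const& delims,
                               bool condense) {
    std::vector<std::string> tokens;
    std::string cur;
    for (char c : s) {
        if (delims.find(c) == std::string::npos) {
            cur += c; // ordinary character: extend the current token
            continue;
        }
        if (!condense || !cur.empty())
            tokens.push_back(cur);
        cur.clear();
    }
    if (!condense || !cur.empty())
        tokens.push_back(cur);
    return tokens;
}
```

Called on the sample string above with ";()+/*-" as delims, the condensing mode gives 8 tokens and the other mode gives 10, the two extra ones being the empty strings between // and */.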
If you have C++14, you can use Spirit X3, which doesn't have the "auto problem" (because it doesn't have Proto expression trees that can end up with dangling references).
Your original code would have been fine in X3, and it will compile a lot faster.
Here's my example using X3:
#include <boost/spirit/home/x3.hpp>
#include <iostream>
#include <iomanip>
#include <string>
#include <vector>
namespace x3 = boost::spirit::x3;
int main() {
std::string const str = "string;some(thing) else + http://me@host:*/path-element.php";
auto const cs = x3::char_(";()+/*-");
std::vector<std::string> tokens;
x3::parse(str.begin(), str.end(), +~cs % +cs, tokens);
//x3::parse(str.begin(), str.end(), *~cs % cs, tokens);
for (auto& tok : tokens) {
std::cout << " " << std::quoted(tok);
}
}
Printing
"string" "some" "thing" " else " " http:" "me@host:" "path" "element.php"