I'm puzzled about how to define a rule in which the minus sign is just a minus character and not a range of characters between two endpoints.
For example, when you write a rule to percent-encode a string of characters, you would normally write

    *(bk::char_("a-zA-Z0-9-_.~") | '%' << bk::right_align(2, 0)[bk::upper[bk::hex]]);

This is meant to read as "lowercase letters, capital letters, digits, minus sign, underscore, dot and tilde", but the third minus sign would create a range between 9 and underscore or something, so you have to put the minus at the end: bk::char_("a-zA-Z0-9_.~-").
That solves the current problem, but what should one do when the input is dynamic, like user input, and a minus sign just means the minus character?
How do I prevent Spirit from assigning a special meaning to any of the possible characters?
EDIT001:
I'll resort to a more concrete example, taken from @sehe's answer:

    void spirit_direct(std::vector<std::string>& result, const std::string& input, char const* delimiter)
    {
        result.clear();
        using namespace bsq;
        if(!parse(input.begin(), input.end(), raw[*(char_ - char_(delimiter))] % char_(delimiter), result))
            result.push_back(input);
    }

To ensure the minus is treated as a minus and not as a range, one would alter the code as follows (according to @sehe's proposal below):

    void spirit_direct(std::vector<std::string>& result, const std::string& input, char const* delimiter)
    {
        result.clear();
        bsq::symbols<char, bsq::unused_type> sym_;
        std::string separators = delimiter;
        for(auto ch : separators)
        {
            sym_.add(std::string(1, ch));
        }
        using namespace bsq;
        if(!parse(input.begin(), input.end(), raw[*(char_ - sym_)] % sym_, result))
            result.push_back(input);
    }

Which looks quite elegant. With a static constant rule, I guess I can escape characters with '\'. Square brackets were mentioned as one of those "special" characters that need to be escaped. Why? What is the meaning of []? Are there any additional characters to escape?
Simple.
First, you devise and specify the supported patterns that the user can supply, with their meanings.
Next, either
you write the code that transforms the input into a character set (e.g. expand all ranges, if supported in user input, and sort the - to be the first character by definition),
or you do not use a character set at all. Some alternatives:

    char_ [ _pass = my_match_predicate(_1) ]

    lit('a') | 'b' | '-' | '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9'

or why not use qi::symbols<char, char> (or even qi::symbols<char, qi::unused_type> sym_; with raw [ sym_ ] or similar)?
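As a rough illustration of the first two routes, here is a minimal standard-library-only sketch. The names make_literal_charset and LiteralSet are hypothetical and not part of Spirit. The first helper assumes the user input is a plain list of literal characters (no range syntax) and produces a definition string with any '-' moved to the front, so it cannot sit between two endpoints. The second is one possible shape for the my_match_predicate idea: a membership test that never interprets any character specially. Wiring either into a Spirit rule is assumed and not shown here.

```cpp
#include <algorithm>
#include <bitset>
#include <string>

// Hypothetical helper: build a character-set definition string from a list
// of literal characters. Duplicates are dropped, the rest is sorted, and a
// '-' (if present) is placed first so it cannot be read as a range operator.
std::string make_literal_charset(const std::string& chars)
{
    std::string spec;
    bool has_minus = false;
    for (char ch : chars) {
        if (ch == '-') {
            has_minus = true;          // remember it, add it back at the front
            continue;
        }
        if (spec.find(ch) == std::string::npos)
            spec.push_back(ch);        // keep the first occurrence only
    }
    std::sort(spec.begin(), spec.end());
    if (has_minus)
        spec.insert(spec.begin(), '-');
    return spec;
}

// Hypothetical predicate: plain set membership, with no character having
// any special meaning. Could serve as the callable behind a semantic action.
class LiteralSet {
public:
    explicit LiteralSet(const std::string& chars)
    {
        for (unsigned char ch : chars)
            members_.set(ch);
    }

    bool operator()(char ch) const
    {
        return members_.test(static_cast<unsigned char>(ch));
    }

private:
    std::bitset<256> members_;
};
```

With this sketch, the input "a-z" is treated as the three characters 'a', '-' and 'z' in both helpers, rather than as the range from a to z.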
Update: The qi::symbols<> approach is surprisingly fast: Live On Coliru. I had a recent optimization job where it disappointed: see this answer (under "Spirit (Trie)"): Binary String to Hex c++
In general, I don't know what you're trying to achieve, but Spirit is not well-suited for generating rules on the fly. See some of my existing boost-spirit answers on this site.