I'm trying to use Spirit.Qi to parse a simple file format that has key-value pairs separated by an equals sign. The file also supports comments and blank lines, as well as quoted values.
I can get nearly all of this to work as expected. However, any blank line or comment causes an empty key-value pair to be added to the map. When the map is swapped for a vector, no blank entries are produced.
#include <fstream>
#include <iostream>
#include <string>
#include <map>
#include "boost/spirit/include/qi.hpp"
#include "boost/spirit/include/karma.hpp"
#include "boost/fusion/include/std_pair.hpp"
using namespace boost::spirit;
using namespace boost::spirit::qi;
////////////////////////////////////////////////////////////////////////////////
int main(int argc, char** argv)
{
    std::ifstream ifs("file");
    ifs >> std::noskipws;
    std::map< std::string, std::string > vars;
    auto value = as_string[*print];
    auto quoted_value = as_string[lexeme['"' >> *(print-'"') >> '"']];
    auto key = as_string[alpha >> *(alnum | char_('_'))];
    auto kvp = key >> '=' >> (quoted_value | value);
    phrase_parse(
        istream_iterator(ifs),
        istream_iterator(),
        -kvp % eol,
        ('#' >> *(char_-eol)) | blank,
        vars);
    std::cout << "vars[" << vars.size() << "]:" << std::endl;
    std::cout << karma::format(*(karma::string << " -> " << karma::string << karma::eol), vars);
    return 0;
}
Input file:
one=two
three=four

# Comment
five=six
Output:
vars[4]:
 -> 
one -> two
three -> four
five -> six
Where is the empty key value pair coming from? And how can I prevent it from being generated?
Firstly, your program has undefined behaviour (and indeed it crashes on my system). The reason is that you can't use auto to store stateful parser expressions: Spirit builds them as expression templates that hold references to temporaries, and those temporaries are gone by the time the auto variable is used.
See "Assigning parsers to auto variables", "boost spirit V2 qi bug associated with optimization level" and others. See e.g. these answers for useful strategies to get around this limitation.
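To make the lifetime problem concrete, here is a minimal sketch in plain C++ that does not use Spirit at all. The types Literal, SequenceRef and SequenceOwned and the functions deep_copy and make_owned are hypothetical names invented for this illustration. They only mimic the idea of an expression tree that refers to its operands versus one that owns them, which is roughly the kind of copy that helpers such as boost::proto::deep_copy or qi::copy are meant to provide.

```cpp
#include <string>

// Hypothetical miniature of an expression-template leaf; not a Spirit type.
struct Literal {
    std::string text;
};

// Holds its operands by reference, like a lazily built expression tree.
// It is only safe to use while both operands are still alive.
template <typename L, typename R>
struct SequenceRef {
    const L& left;
    const R& right;
};

// Holds its operands by value, so the tree owns everything it needs.
template <typename L, typename R>
struct SequenceOwned {
    L left;
    R right;
    std::string describe() const { return left.text + " >> " + right.text; }
};

// Copies the referenced operands into an owning tree.
template <typename L, typename R>
SequenceOwned<L, R> deep_copy(const SequenceRef<L, R>& expr) {
    return SequenceOwned<L, R>{expr.left, expr.right};
}

// The temporaries below live until the end of the full expression, so
// copying them inside that same expression is well defined. Storing the
// SequenceRef itself in a variable and using it later would not be.
inline SequenceOwned<Literal, Literal> make_owned(const std::string& a,
                                                  const std::string& b) {
    return deep_copy(SequenceRef<Literal, Literal>{Literal{a}, Literal{b}});
}
```

Calling make_owned("key", "value").describe() yields the string key >> value. The owning copy stays valid after the temporaries are destroyed, which is the property the auto variables in the question lack.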
Secondly, the empty line is because of the grammar.
There's a difference between
(-kvp) % qi::eol
or
-(kvp % qi::eol)
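The first will result in "optionally parsing a kvp" followed by "push the result into the attribute container". The push happens whether or not the kvp matched, so a blank line or a comment contributes a default-constructed (empty) pair.
The latter will optionally "parse 1 or more kvp into a container". Note that this won't push the empty value if it wasn't matched.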
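The difference can be modelled without Spirit. The sketch below is only an approximation of the attribute handling, and it rests on some assumptions: parse_kvp, optional_inside_list and list_inside_optional are hypothetical helpers written for this illustration, lines are split with std::getline instead of a skipper, and lines that do not match are simply ignored rather than stopping the parse as the real grammar would.

```cpp
#include <map>
#include <sstream>
#include <string>
#include <utility>

using Pair = std::pair<std::string, std::string>;
using Map  = std::map<std::string, std::string>;

// Hypothetical stand-in for the kvp rule: succeeds only on "key=value".
// Blank lines and comment lines do not match.
bool parse_kvp(const std::string& line, Pair& out) {
    const std::string::size_type eq = line.find('=');
    if (line.empty() || line[0] == '#' || eq == std::string::npos || eq == 0)
        return false;
    out = Pair(line.substr(0, eq), line.substr(eq + 1));
    return true;
}

// Models (-kvp) % eol: each line contributes one element, matched or not.
Map optional_inside_list(const std::string& text) {
    Map out;
    std::istringstream in(text);
    for (std::string line; std::getline(in, line);) {
        Pair p;             // stays default-constructed if nothing matches
        parse_kvp(line, p);
        out.insert(p);      // pushed unconditionally
    }
    return out;
}

// Models -(kvp % +eol): only pairs that actually matched are pushed.
Map list_inside_optional(const std::string& text) {
    Map out;
    std::istringstream in(text);
    for (std::string line; std::getline(in, line);) {
        Pair p;
        if (parse_kvp(line, p))
            out.insert(p);
    }
    return out;
}
```

With the sample input from the question, optional_inside_list produces four entries, one of them with an empty key, while list_inside_optional produces three. The map keeps only one empty-key entry because duplicate keys are rejected on insertion.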
- I suggest making key and value lexemes as well (just by dropping the Skipper on the rule declarations, really). You probably didn't want 'key name 1=value 1' to parse as "keyname1" -> "value1". You probably didn't want to allow key # no value\n either.
- Avoid using namespace boost::spirit. It's a bad idea. Trust me :/
- Using +eol instead of eol allows for the empty lines, which appears to be what you want.
#define BOOST_SPIRIT_DEBUG
#include "boost/spirit/include/qi.hpp"
#include "boost/spirit/include/karma.hpp"
#include "boost/fusion/include/std_pair.hpp"
#include <fstream>
#include <iostream>
#include <map>
#include <string>
#include <utility>
namespace qi = boost::spirit::qi;
namespace karma = boost::spirit::karma;
template <typename It, typename Skipper, typename Data>
struct kvp_grammar : qi::grammar<It, Data(), Skipper> {
    kvp_grammar() : kvp_grammar::base_type(start) {
        using namespace qi;
        value        = raw [*print];
        quoted_value = '"' >> *~char_('"') >> '"';
        key          = raw [ alpha >> *(alnum | '_') ];
        kvp          = key >> '=' >> (quoted_value | value);
        // kvps separated by one or more line ends; nothing is pushed
        // into the container unless a kvp actually matched
        start        = -(kvp % +eol);
        BOOST_SPIRIT_DEBUG_NODES((value)(quoted_value)(key)(kvp))
    }
  private:
    using Pair = std::pair<std::string, std::string>;
    qi::rule<It, std::string(), Skipper> value;
    qi::rule<It, Pair(), Skipper> kvp;
    qi::rule<It, Data(), Skipper> start;
    // lexeme: declared without a Skipper
    qi::rule<It, std::string()> quoted_value, key;
};
template <typename Map>
bool parse_vars(std::istream& is, Map& data) {
    using It = boost::spirit::istream_iterator;
    using Skipper = qi::rule<It>;
    kvp_grammar<It, Skipper, Map> grammar;
    It f(is >> std::noskipws), l;
    // skip comments (from '#' to the end of the line) and blanks
    Skipper skipper = ('#' >> *(qi::char_-qi::eol)) | qi::blank;
    return qi::phrase_parse(f, l, grammar, skipper, data);
}
int main() {
    std::ifstream ifs("input.txt");
    std::map<std::string, std::string> vars;
    if (parse_vars(ifs, vars)) {
        std::cout << "vars[" << vars.size() << "]:" << std::endl;
        std::cout << karma::format(*(karma::string << " -> " << karma::string << karma::eol), vars);
    }
}
Output (currently broken on Coliru):
vars[3]:
five -> six
one -> two
three -> four
With debug info:
<kvp>
<try>one=two\nthree=four\n\n</try>
<key>
<try>one=two\nthree=four\n\n</try>
<success>=two\nthree=four\n\n# C</success>
<attributes>[[o, n, e]]</attributes>
</key>
<quoted_value>
<try>two\nthree=four\n\n# Co</try>
<fail/>
</quoted_value>
<value>
<try>two\nthree=four\n\n# Co</try>
<success>\nthree=four\n\n# Comme</success>
<attributes>[[t, w, o]]</attributes>
</value>
<success>\nthree=four\n\n# Comme</success>
<attributes>[[[o, n, e], [t, w, o]]]</attributes>
</kvp>
<kvp>
<try>three=four\n\n# Commen</try>
<key>
<try>three=four\n\n# Commen</try>
<success>=four\n\n# Comment\nfiv</success>
<attributes>[[t, h, r, e, e]]</attributes>
</key>
<quoted_value>
<try>four\n\n# Comment\nfive</try>
<fail/>
</quoted_value>
<value>
<try>four\n\n# Comment\nfive</try>
<success>\n\n# Comment\nfive=six</success>
<attributes>[[f, o, u, r]]</attributes>
</value>
<success>\n\n# Comment\nfive=six</success>
<attributes>[[[t, h, r, e, e], [f, o, u, r]]]</attributes>
</kvp>
<kvp>
<try>five=six\n</try>
<key>
<try>five=six\n</try>
<success>=six\n</success>
<attributes>[[f, i, v, e]]</attributes>
</key>
<quoted_value>
<try>six\n</try>
<fail/>
</quoted_value>
<value>
<try>six\n</try>
<success>\n</success>
<attributes>[[s, i, x]]</attributes>
</value>
<success>\n</success>
<attributes>[[[f, i, v, e], [s, i, x]]]</attributes>
</kvp>
<kvp>
<try></try>
<key>
<try></try>
<fail/>
</key>
<fail/>
</kvp>