c++ boost boost-spirit

Get current line in boost spirit grammar


I am trying to get the current line of the file I am parsing with Boost.Spirit. I have a grammar class and structures that my commands are parsed into, and I would also like to record the line each command was found on and parse that into the structures as well. I have wrapped my istream file iterator in a multi_pass iterator and then wrapped that in a boost::spirit::classic::position_iterator2. How can I get the current position of the iterator from within my grammar's rules, or is this not possible?

Update: It is similar to that problem but I just need to be able to keep a count of all the lines processed. I don't need to do all of the extra buffering that was done in the solution.


Solution

  • Update: It is similar to that problem but I just need to be able to keep a count of all the lines processed. I don't need to do all of the extra buffering that was done in the solution.

    Keeping a count of all lines processed is not nearly the same as "getting the current line".

    Simple Take

    If a total count is all you need, just check it after the parse:


    #include <boost/spirit/include/qi.hpp>
    #include <boost/spirit/include/support_line_pos_iterator.hpp>
    #include <fstream>
    #include <iostream>
    #include <set>
    #include <string>
    namespace qi = boost::spirit::qi;
    
    int main() {
        using It = boost::spirit::istream_iterator;
    
        std::ifstream ifs("main.cpp");
        // noskipws keeps whitespace in the stream, so newlines reach the line counter
        boost::spirit::line_pos_iterator<It> f(It(ifs >> std::noskipws)), l;
    
        std::set<std::string> words;
        if (qi::phrase_parse(f, l, *qi::lexeme[+qi::graph], qi::space, words)) {
            std::cout << "Parsed " << words.size() << " words";
            if (!words.empty())
                std::cout << " (from '" << *words.begin() << "' to '" << *words.rbegin() << "')";
            std::cout << "\nLast line processed: " << boost::spirit::get_line(f) << "\n";
        }
    }
    

    Prints

    Parsed 50 words (from '"' to '}')
    Last line processed: 22
    

    Slightly More Complex Take

    If you say "no, wait, I really did want to get the current line while parsing", the real full monty is a separate, longer treatment. Here's the completely trimmed-down version using iter_pos:


    #include <boost/spirit/include/qi.hpp>
    #include <boost/spirit/include/phoenix.hpp>
    #include <boost/spirit/include/support_line_pos_iterator.hpp>
    #include <boost/spirit/repository/include/qi_iter_pos.hpp>
    #include <boost/fusion/adapted/std_pair.hpp>
    #include <fstream>
    #include <iostream>
    #include <map>
    #include <string>
    
    namespace qi = boost::spirit::qi;
    namespace qr = boost::spirit::repository::qi;
    
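    using LineNum = std::size_t;
    
    // Phoenix-callable wrapper: get_line is found by argument-dependent lookup
    // because It is a boost::spirit::line_pos_iterator.
    struct line_number_f {
        template <typename It> LineNum operator()(It it) const { return get_line(it); }
    };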
    
    static boost::phoenix::function<line_number_f> line_number_;
    
    int main() {
        using Underlying = boost::spirit::istream_iterator;
        using It = boost::spirit::line_pos_iterator<Underlying>;
        // iter_pos consumes no input; it exposes the current iterator, which is mapped to its line number
        qi::rule<It, LineNum()> line_no = qr::iter_pos [ qi::_val = line_number_(qi::_1) ];
    
        std::ifstream ifs("main.cpp");
        It f(Underlying{ifs >> std::noskipws}), l;
    
        std::multimap<LineNum, std::string> words;
    
        if (qi::phrase_parse(f, l, +(line_no >> qi::lexeme[+qi::graph]), qi::space, words)) {
            std::cout << "Parsed " << words.size() << " words.\n";
    
            if (!words.empty()) {
                auto& first = *words.begin();
                std::cout << "First word: '" << first.second << "' (in line " << first.first << ")\n";
                auto& last = *words.rbegin();
                std::cout << "Last word: '" << last.second << "' (in line " << last.first << ")\n";
            }
    
            std::cout << "Line 20 contains:\n";
            auto p = words.equal_range(20);
            for (auto it = p.first; it != p.second; ++it)
                std::cout << " - '" << it->second << "'\n";
    
        }
    }
    

    Printing:

    Parsed 166 words.
    First word: '#include' (in line 1)
    Last word: '}' (in line 46)
    Line 20 contains:
     - 'int'
     - 'main()'
     - '{'