Tags: c++, parsing, boost, boost-spirit-qi, wavefront

OBJ Parser with Boost Spirit - Ignoring comments


I'm trying to write a basic OBJ file loader using the Boost Spirit library. I got it working with a standard std::ifstream, but I'm wondering whether it's possible to run phrase_parse over the entire file using a memory-mapped file, since that seems to give the best performance, as posted here.

I have the following code, which seems to work well, but it breaks when there is a comment in the file. So my question is: how do you ignore a comment that starts with '#' in an OBJ file using Spirit?

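#include <boost/fusion/include/adapt_struct.hpp>
#include <boost/iostreams/device/mapped_file.hpp>
#include <boost/spirit/include/qi.hpp>
#include <vector>

struct vertex {
    double x, y, z;
};

// Adapt the struct so Spirit can fill it from three consecutive doubles.
BOOST_FUSION_ADAPT_STRUCT(
    vertex,
    (double, x)
    (double, y)
    (double, z)
)

std::vector<vertex> b_vertices;

// Map the whole file into memory and parse it in a single pass.
boost::iostreams::mapped_file mmap(
    path,
    boost::iostreams::mapped_file::readonly);
const char* f = mmap.const_data();
const char* l = f + mmap.size();

using namespace boost::spirit::qi;

// One "v" or "vn" record per line; blank skips spaces and tabs, not newlines.
bool ok = phrase_parse(f, l,
                       (("v" >> double_ >> double_ >> double_) |
                        ("vn" >> double_ >> double_ >> double_)) % eol,
                       blank, b_vertices);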

The code above works well when the file contains only vertices and normals, with no comments or other data. When any other kind of line appears, the parser fails (as it should). Is there a way to make it work without going back to parsing the file line by line, which was slower (almost 2.5x in my tests)? Thank you!


Solution

  • The simplest way that comes to mind is to make comments skippable:

    bool ok = qi::phrase_parse(
            f, l,
            (
                ("v"  >> qi::double_ >> qi::double_ >> qi::double_) |
                ("vn" >> qi::double_ >> qi::double_ >> qi::double_)
            ) % qi::eol,
            // skipper: a '#' comment through the end of its line, or a blank
            ('#' >> *(qi::char_ - qi::eol) >> qi::eol | qi::blank),
            b_vertices);
    

    Note that this also 'recognizes' comments if # appears somewhere inside the line. This is probably just fine (as it would make the parsing fail, unless it was a comment trailing on an otherwise valid input line).

    See it Live on Coliru

    Alternatively, use some phoenix magic to handle "comment lines" just as you handle a "vn" or "v" line.