
Segmentation fault with boost spirit parser for one digit


I'm trying to use Boost.Spirit. When I test a very simple parser that should parse a single digit, the program crashes.

#include <boost/spirit/include/qi.hpp>
#include <iostream>
#include <string>

namespace qi = boost::spirit::qi;

auto const noneZero = qi::char_('1') |
                      qi::char_('2') |
                      qi::char_('3');

int main(int argc, char** argv)
{
    std::string input = "9";
    std::string output;

    if (qi::parse(input.begin(), input.end(), noneZero, output))
    {
        std::cout << "Ok => '" << output << "'\n";
    }
    else
    {
        std::cout << "No\n";
    }
    return 0;
}

What am I doing wrong? This should be a very simple case, and I can't figure out where the mistake is.

Stranger still, the following code works fine. But why? The grammars should behave the same, no?

#include <boost/spirit/include/qi.hpp>
#include <iostream>
#include <string>

namespace qi = boost::spirit::qi;

auto const noneZero = qi::char_('1', '9');

int main(int argc, char** argv)
{
    std::string input = "9";
    std::string output;

    if (qi::parse(input.begin(), input.end(), noneZero, output))
    {
        std::cout << "Ok => '" << output << "'\n";
    }
    else
    {
        std::cout << "No\n";
    }
    return 0;
}

Even more interesting, the following program does not crash:

#include <boost/spirit/include/qi.hpp>
#include <iostream>
#include <string>

namespace qi = boost::spirit::qi;

auto const noneZero = qi::char_('1') |
                      qi::char_('2');

int main(int argc, char** argv)
{
    std::string input = "9";
    std::string output;

    if (qi::parse(input.begin(), input.end(), noneZero, output))
    {
        std::cout << "Ok => '" << output << "'\n";
    }
    else
    {
        std::cout << "No\n";
    }
    return 0;
}

Could someone explain why this grammar does not crash:

auto const noneZero = qi::char_('1') |
                      qi::char_('2');

And why this grammar crashes:

auto const noneZero = qi::char_('1') |
                      qi::char_('2') |
                      qi::char_('3');

Suspecting a problem with my own computer, I tried all of these examples on Coliru, with the same results. All of them were compiled with the following command:

clang++ test.cpp -Wall -Werror -Wextra --std=c++14

Solution

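  • You're using auto without deep-copying the Proto expression trees. The expression template stores references to temporaries that are destroyed at the end of the declaration, which leaves dangling references and hence Undefined Behaviour.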

    Here's a fix (note the shorter way to write it, too):

    auto const nonZero = qi::copy(qi::char_("1-3"));
    

    You could also just write

    auto const nonZero = qi::copy(qi::digit - '0');
    

    The other samples only appear to work; they are still Undefined Behaviour (UB). Anything can happen once you go outside the lines.
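
    To illustrate why the copy matters, here is a deliberately simplified model of an expression template. It is not Boost.Proto's actual implementation: the names `Lit`, `AltRef`, `AltVal`, `deepCopy`, `matchesOneToThree` and `selfCheck` are hypothetical and exist only for this sketch. The `|` operator builds nodes that hold their operands by reference, so the tree is only valid until the end of the full-expression that created it. Rebuilding it out of value-holding nodes before that point is roughly what `qi::copy` does by deep-copying the Proto tree.

    ```cpp
    #include <cassert>

    namespace model {

    // Stand-in for a terminal such as qi::char_('1'): matches one character.
    struct Lit {
        char c;
        bool match(char in) const { return in == c; }
    };

    // Alternative node holding its operands BY REFERENCE, as an un-copied
    // expression tree does. Only valid while both operands are still alive.
    template <typename L, typename R>
    struct AltRef {
        L const& lhs;
        R const& rhs;
        bool match(char in) const { return lhs.match(in) || rhs.match(in); }
    };

    // Alternative node holding its operands BY VALUE: safe to store.
    template <typename L, typename R>
    struct AltVal {
        L lhs;
        R rhs;
        bool match(char in) const { return lhs.match(in) || rhs.match(in); }
    };

    // a | b builds a reference-holding node, possibly pointing at temporaries.
    template <typename L, typename R>
    AltRef<L, R> operator|(L const& l, R const& r) { return {l, r}; }

    // deepCopy rebuilds the tree out of value-holding nodes (C++14).
    inline Lit deepCopy(Lit const& l) { return l; }

    template <typename L, typename R>
    auto deepCopy(AltRef<L, R> const& a) {
        return AltVal<decltype(deepCopy(a.lhs)), decltype(deepCopy(a.rhs))>{
            deepCopy(a.lhs), deepCopy(a.rhs)};
    }

    } // namespace model

    inline bool matchesOneToThree(char in) {
        using model::Lit;
        // Dangling (do NOT do this): the temporaries die at the end of the line.
        //   auto const bad = Lit{'1'} | Lit{'2'} | Lit{'3'};
        // Safe: the copy is taken while every temporary is still alive.
        auto const good = deepCopy(Lit{'1'} | Lit{'2'} | Lit{'3'});
        return good.match(in);
    }

    // Small usage check for the sketch above.
    inline void selfCheck() {
        assert(matchesOneToThree('2'));
        assert(!matchesOneToThree('9'));
    }
    ```

    In this model, the three-alternative expression also keeps a reference to the temporary inner `AltRef` node, not just to the terminals. Whether a given dangling tree crashes or seems to work is still down to chance, which is the nature of UB.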

    Live Demo

    #include <boost/spirit/include/qi.hpp>
    #include <iostream>
    #include <string>
    
    namespace qi = boost::spirit::qi;
    
    auto const nonZero = qi::copy(qi::char_("1-9"));
    
    int main(int argc, char** argv)
    {
        std::string input = "9";
        std::string output;
    
        if (qi::parse(input.begin(), input.end(), nonZero, output))
        {
            std::cout << "Ok => '" << output << "'\n";
        }
        else
        {
            std::cout << "No\n";
        }
        return 0;
    }
    

    Prints

    Ok => '9'