c++ config whitespace istream boost-program-options

How to handle spaces in Boost::program_options config files for custom option value types that are not strings?


This question concerns the parsing of values in a Boost::program_options configuration file.

I have a simple custom data structure:

struct Vector {
    double x, y, z;
};

I have an istream deserialiser for the format "(x, y, z)" that I borrowed from another SO post:

// https://codereview.stackexchange.com/a/93811/186081
struct Expect {
    char expected;
    Expect(char expected) : expected(expected) {}
    friend std::istream& operator>>(std::istream& is, Expect const& e) {
        char actual;
        if ((is >> actual) && (actual != e.expected)) {
            is.setstate(std::ios::failbit);
        }
        return is;
    }
};

template<typename CharT>
std::basic_istream<CharT> &
operator>>(std::basic_istream<CharT> &in, Vector &v) {
    in >> Expect('(') >> v.x
       >> Expect(',') >> v.y
       >> Expect(',') >> v.z
       >> Expect(')');
    return in;
}

I am using an instance of Vector as a value store for Boost::program_options:

Vector vector {0.0, 0.0, 0.0};
po::options_description opts("Usage");
opts.add_options()
      ("vector", po::value(&vector), "The vector");

po::variables_map vm;
po::store(po::parse_config_file("config.cfg", opts, true), vm);
po::notify(vm);

The problem is that the configuration file format does not work if the vector value representation contains spaces. For example, this config file parses correctly:

vector = (0.0,1.1,2.2)

However this, with spaces, does not parse:

vector = (0.0, 1.1, 2.2)

Instead, program_options throws:

the argument ('(0.0, 1.1, 2.2)') for option 'vector' is invalid

However, for options that are declared as std::string, spaces seem to be OK:

some_string = this is a string

I found a few posts that mention using quotes, but this doesn't seem to work (same error):

vector = "(0.0, 1.1, 2.2)"

A few other posts suggest custom parsers, but I'm not sure how I'd go about implementing one, and it seems like a lot of work just to handle a few spaces.

I assume this behaviour comes from the way command-line options are parsed, even though this is config-file parsing. On a command line, something like --vector (0.0, 1.1, 2.2) would not make much sense anyway (ignoring, for now, that ( and ) are shell-reserved characters).

Is there a good way to handle this?


Solution

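  • No, you can't.

    Edit: On second thought, you can try changing which characters are treated as delimiters, as described at https://en.cppreference.com/w/cpp/locale/ctype

    program_options uses lexical_cast to convert the value, and lexical_cast requires that a single operator>> consume the entire content. As far as I can tell, lexical_cast also clears the skipws flag on its internal stream, so by default the extractions inside operator>> do not skip over a space. The content is then never fully consumed, which produces the error.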
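    The failure can be reproduced without Boost. The following is a minimal sketch, not the library's actual code path: it assumes that lexical_cast runs operator>> on a stream with the skipws flag cleared. `parses` is a hypothetical helper, and the non-template operator>> is a simplification of the one in the question.

```cpp
#include <iostream>
#include <sstream>
#include <string>

struct Vector { double x, y, z; };

// Same helper as in the question: reads one character and fails on a mismatch.
struct Expect {
    char expected;
    Expect(char expected) : expected(expected) {}
    friend std::istream& operator>>(std::istream& is, Expect const& e) {
        char actual;
        if ((is >> actual) && (actual != e.expected)) {
            is.setstate(std::ios::failbit);
        }
        return is;
    }
};

// Non-template simplification of the question's deserialiser.
std::istream& operator>>(std::istream& in, Vector& v) {
    return in >> Expect('(') >> v.x
              >> Expect(',') >> v.y
              >> Expect(',') >> v.z
              >> Expect(')');
}

// Hypothetical helper: tries to parse `text`, with or without whitespace skipping.
bool parses(const std::string& text, bool skip_whitespace) {
    std::istringstream in(text);
    if (!skip_whitespace) {
        in >> std::noskipws;  // assumed to mimic the stream set up by lexical_cast
    }
    Vector v{};
    return static_cast<bool>(in >> v);
}
```

    With whitespace skipping disabled, parses("(0.0,1.1,2.2)", false) succeeds but parses("(0.0, 1.1, 2.2)", false) fails, because the extraction of v.y meets a space it is not allowed to skip. With skipping enabled, parses("(0.0, 1.1, 2.2)", true) succeeds. So the operator>> has to deal with spaces explicitly.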

    Hence you can do something like:

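        #include <iostream>
        #include <locale>
        #include <vector>
        #include <boost/lexical_cast.hpp>

        // Vector and Expect are defined as in the question.

        // Consumes any run of the given character; never sets failbit itself.
        struct Optional {
            char optional;
            Optional(char optional) : optional(optional) {}
            friend std::istream& operator>>(std::istream& is, Optional const& o) {
                // peek() returns an int_type, so compare as int to keep EOF distinct
                while (is.peek() == std::istream::traits_type::to_int_type(o.optional)) {
                    is.get();
                }
                return is;
            }
        };

        // Narrow streams use std::ctype<char>, which is table-driven and has no
        // do_is() to override, so the classification table is modified instead.
        struct vector_ctype : std::ctype<char> {
            static const mask* make_table() {
                // copy the "C" locale table, then clear the space bit for ' '
                static std::vector<mask> table(classic_table(), classic_table() + table_size);
                table[' '] &= ~space; // space will NOT be classified as whitespace
                return table.data();
            }
            vector_ctype(std::size_t refs = 0) : ctype(make_table(), false, refs) {}
        };

        template<typename CharT>
        std::basic_istream<CharT> &
        operator>>(std::basic_istream<CharT> &in, Vector &v) {
            // temporarily install the modified facet, then restore the original locale
            std::locale default_locale = in.getloc();
            in.imbue(std::locale(default_locale, new vector_ctype()));
            in >> Expect('(') >> Optional(' ') >> v.x >> Optional(' ')
               >> Expect(',') >> Optional(' ') >> v.y >> Optional(' ')
               >> Expect(',') >> Optional(' ') >> v.z >> Optional(' ')
               >> Expect(')');
            in.imbue(default_locale);
            return in;
        }

        int main()
        {
            Vector v = boost::lexical_cast<Vector>("(1,  2,  3)");
            std::cout << v.x << "," << v.y << "," << v.z << std::endl;
        }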
    

    Output:

    1,2,3
    

    This should give you the correct result in program_options as well.
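    A simpler variant, which is not part of the answer above, may also be enough: skip whitespace explicitly with std::ws before each token, so the parser no longer depends on the stream's skipws flag or on its locale. This is only a sketch. `ExpectWs` and `parse_noskipws` are hypothetical names, the non-template operator>> is a simplification, and it has been exercised here only on a plain std::istringstream with skipws cleared, not inside program_options.

```cpp
#include <istream>
#include <sstream>
#include <string>

struct Vector { double x, y, z; };

// Hypothetical helper: skips leading whitespace explicitly, then requires one character.
struct ExpectWs {
    char expected;
    explicit ExpectWs(char expected) : expected(expected) {}
    friend std::istream& operator>>(std::istream& is, ExpectWs const& e) {
        char actual;
        if ((is >> std::ws >> actual) && actual != e.expected) {
            is.setstate(std::ios::failbit);
        }
        return is;
    }
};

std::istream& operator>>(std::istream& in, Vector& v) {
    // std::ws before each number does by hand what skipws would normally do.
    return in >> ExpectWs('(') >> std::ws >> v.x
              >> ExpectWs(',') >> std::ws >> v.y
              >> ExpectWs(',') >> std::ws >> v.z
              >> ExpectWs(')');
}

// Hypothetical helper: parses on a stream with skipws cleared, which is the
// assumed condition under which lexical_cast calls operator>>.
bool parse_noskipws(const std::string& text, Vector& out) {
    std::istringstream in(text);
    in >> std::noskipws;
    return static_cast<bool>(in >> out);
}
```

    Under these assumptions, parse_noskipws("(0.5, 1.5, 2.25)", v) succeeds and fills v, while malformed input such as "(1; 2, 3)" still fails. Whitespace after the closing parenthesis is deliberately left unread, so a conversion that requires the whole input to be consumed would still reject trailing spaces.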