
How to extract mixed format using istringstream


Why does my program not output:

10
1.546
,Apple 1

instead of

10
1
<empty space>

Here's my program:

#include <iostream>
#include <string>
#include <sstream>

using namespace std;

int main () {
    string str = "10,1.546,Apple 1";
    istringstream stream (str);
    int a;
    double b;
    string c, dummy;
    stream >> a >> dummy >> b >> dummy >> c;
    cout << a << endl;
    cout << b << endl;
    cout << c << endl;
    return 0;
}

Basically, I am trying to parse comma-separated strings. Any smoother way to do this would help me immensely.


Solution

  • In IOStreams, the formatted extractor for strings (both C-strings and C++ strings) imposes virtually no formatting requirements: after skipping leading whitespace, it extracts every character until it reaches the next whitespace character or the end of the stream. In your example, you use a string to eat up the commas between the important data, but the output you see follows directly from that behavior: the first dummy doesn't just eat the comma, it takes the rest of the character sequence up to the next whitespace character, i.e. ",1.546,Apple". The extraction into b then reads the trailing 1 and hits the end of the stream, so the remaining extractions fail and c stays empty.
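To see this behavior in isolation, here is a minimal sketch that reruns the extraction from the question and records what each variable receives. The Fields struct and the parse_with_string_dummy name are made up for illustration and are not part of the original program.

```cpp
#include <sstream>
#include <string>

// Hypothetical container for what each extraction ends up holding.
struct Fields
{
    int a = 0;
    double b = 0.0;
    std::string c;
    std::string first_dummy;
    bool ok = false;   // state of the stream after all extractions
};

// Repeats the extraction from the question with std::string dummies.
Fields parse_with_string_dummy(const std::string& line)
{
    std::istringstream stream(line);
    Fields f;
    std::string second_dummy;

    // first_dummy takes everything up to the next whitespace character,
    // not just the comma.
    stream >> f.a >> f.first_dummy >> f.b >> second_dummy >> f.c;

    f.ok = static_cast<bool>(stream);
    return f;
}
```

For "10,1.546,Apple 1", first_dummy ends up holding ",1.546,Apple", b is read from the trailing 1, and the stream is in a failed state afterwards, which is why c prints as empty.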

    To avoid this, use a char for the dummy variable: extracting into a char reads exactly one non-whitespace character. And if you want to put Apple 1 into a string, you will need an unformatted extraction, because the formatted extractor operator>>() stops reading at whitespace. The appropriate function to use here is std::getline():

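    string c;
    char dummy;  // holds one separator character at a time
    
    if ((stream >> a >> dummy >> b >> dummy) &&
         std::getline(stream >> std::ws, c))
    //   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    {
        // extraction succeeded: a, b and c now hold the parsed values
    }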
    

    A formatted extraction leaves any whitespace that follows it (such as a newline) in the stream, and std::getline() does not skip it, which is why I used std::ws to discard leading whitespace first. I also wrapped the extraction in an if statement so that you can tell whether it succeeded.
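Put together as a self-contained function, the approach above might look like the following sketch. The Record struct and the parse_record name are hypothetical, and the check that both separators really are commas is an extra assumption added here, not something the snippet above does.

```cpp
#include <sstream>
#include <string>

// Hypothetical record type for the three fields in the question.
struct Record
{
    int a = 0;
    double b = 0.0;
    std::string c;
};

// Parses "int,double,rest of line". Returns false if any extraction fails
// or if a separator is not a comma.
bool parse_record(const std::string& line, Record& out)
{
    std::istringstream stream(line);
    char sep1 = 0, sep2 = 0;

    // Each char extraction consumes exactly one non-whitespace character.
    if (!(stream >> out.a >> sep1 >> out.b >> sep2))
        return false;
    if (sep1 != ',' || sep2 != ',')
        return false;

    // Skip leading whitespace, then read the rest of the line, spaces included.
    return static_cast<bool>(std::getline(stream >> std::ws, out.c));
}
```

Calling parse_record("10,1.546,Apple 1", r) fills r.a with 10, r.b with 1.546 and r.c with "Apple 1".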


    "Any smoother way to do this would help me immensely."

    You can classify the comma as a whitespace character through the std::ctype<char> facet of the locale imbued in the stream. The formatted extractors then skip commas just as they skip spaces, which makes the dummy variable unnecessary. Here's an example:

    #include <iostream>
    #include <locale>
    #include <sstream>
    #include <string>
    #include <vector>
    
    namespace detail
    {
        enum options { add, remove };
    
        class ctype : public std::ctype<char>
        {
        private:
            static mask* get_table(const std::string& ws, options opt)
            {
                static std::vector<mask> table(classic_table(),
                                               classic_table() + table_size);
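                for (char c : ws)
                {
                    // index as unsigned char so a negative char cannot
                    // index outside the table
                    const unsigned char i = static_cast<unsigned char>(c);
                    if (opt == add)
                        table[i] |= space;
                    else if (opt == remove)
                        table[i] &= ~space;
                }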
                return &table[0];
            }
        public:
            ctype(const std::string& ws, options opt)
                : std::ctype<char>(get_table(ws, opt)) { }
        };
    }
    
    class adjustws_impl
    {
    public:
        adjustws_impl(const std::string& ws, detail::options opt) :
            m_ws(ws),
            m_opt(opt)
        { }
    
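        friend std::istream& operator>>(std::istream& is,
                                        const adjustws_impl& manip)
        {
            // std::has_facet looks a facet up by its id, and detail::ctype
            // inherits that id from std::ctype<char>, so it cannot portably
            // tell whether this facet is already installed. Always imbue a
            // fresh one; the new locale takes ownership of the facet.
            is.imbue(std::locale(is.getloc(),
                                 new detail::ctype(manip.m_ws, manip.m_opt)));
            return is;
        }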
    private:
        std::string m_ws;
        detail::options m_opt;
    };
    
    adjustws_impl setws(const std::string& ws)
    {
        return adjustws_impl(ws, detail::add);
    }
    
    adjustws_impl unsetws(const std::string& ws)
    {
        return adjustws_impl(ws, detail::remove);
    }
    
    int main()
    {
        std::istringstream iss("10,1.546,Apple 1");
        int a; double b; std::string c;
    
        iss >> setws(","); // set comma to a whitespace character
    
        if ((iss >> a >> b) && std::getline(iss >> std::ws, c))
        {
            // ...
        }
    
        iss >> unsetws(","); // remove the whitespace classification
    }
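If a custom facet is more machinery than you need, a different and simpler technique (not the one shown above) is to split the line on commas with the delimiter overload of std::getline() and convert the fields afterwards. This is only a sketch: split_fields is a hypothetical helper name, and it assumes fields never contain an embedded or quoted comma.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits a line into fields on a delimiter character.
// Whitespace inside a field is preserved.
std::vector<std::string> split_fields(const std::string& line, char delim = ',')
{
    std::vector<std::string> fields;
    std::istringstream stream(line);
    std::string field;

    // The three-argument std::getline reads up to (and discards) delim.
    while (std::getline(stream, field, delim))
        fields.push_back(field);

    return fields;
}
```

For "10,1.546,Apple 1" this yields the three fields "10", "1.546" and "Apple 1", which you can convert with std::stoi and std::stod (both throw std::invalid_argument when no conversion is possible). Note that an empty field at the very end of the line is not reported by this loop.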