Search code examples
c++stlcoding-styleunsigned

acceptable fix for majority of signed/unsigned warnings?


I myself am convinced that in a project I'm working on signed integers are the best choice in the majority of cases, even though the value contained within can never be negative. (Simpler reverse for loops, less chance for bugs, etc., in particular for integers which can only hold values between 0 and, say, 20, anyway.)

The majority of the places where this goes wrong is a simple iteration of a std::vector, often this used to be an array in the past and has been changed to a std::vector later. So these loops generally look like this:

for (int i = 0; i < someVector.size(); ++i) { /* do stuff */ }

Because this pattern is used so often, the amount of compiler warning spam about this comparison between signed and unsigned type tends to hide more useful warnings. Note that we definitely do not have vectors with more then INT_MAX elements, and note that until now we used two ways to fix compiler warning:

for (unsigned i = 0; i < someVector.size(); ++i) { /*do stuff*/ }

This usually works but might silently break if the loop contains any code like 'if (i-1 >= 0) ...', etc.

for (int i = 0; i < static_cast<int>(someVector.size()); ++i) { /*do stuff*/ }

This change does not have any side effects, but it does make the loop a lot less readable. (And it's more typing.)

So I came up with the following idea:

template <typename T> struct vector : public std::vector<T>
{
    typedef std::vector<T> base;

    int size() const     { return base::size(); }
    int max_size() const { return base::max_size(); }
    int capacity() const { return base::capacity(); }

    vector()                  : base() {}
    vector(int n)             : base(n) {}
    vector(int n, const T& t) : base(n, t) {}
    vector(const base& other) : base(other) {}
};

template <typename Key, typename Data> struct map : public std::map<Key, Data>
{
    typedef std::map<Key, Data> base;
    typedef typename base::key_compare key_compare;

    int size() const     { return base::size(); }
    int max_size() const { return base::max_size(); }

    int erase(const Key& k) { return base::erase(k); }
    int count(const Key& k) { return base::count(k); }

    map() : base() {}
    map(const key_compare& comp) : base(comp) {}
    template <class InputIterator> map(InputIterator f, InputIterator l) : base(f, l) {}
    template <class InputIterator> map(InputIterator f, InputIterator l, const key_compare& comp) : base(f, l, comp) {}
    map(const base& other) : base(other) {}
};

// TODO: similar code for other container types

What you see is basically the STL classes with the methods which return size_type overridden to return just 'int'. The constructors are needed because these aren't inherited.

What would you think of this as a developer, if you'd see a solution like this in an existing codebase?

Would you think 'whaa, they're redefining the STL, what a huge WTF!', or would you think this is a nice simple solution to prevent bugs and increase readability. Or maybe you'd rather see we had spent (half) a day or so on changing all these loops to use std::vector<>::iterator?

(In particular if this solution was combined with banning the use of unsigned types for anything but raw data (e.g. unsigned char) and bit masks.)


Solution

  • I made this community wiki... Please edit it. I don't agree with the advice against "int" anymore. I now see it as not bad.

    Yes, i agree with Richard. You should never use 'int' as the counting variable in a loop like those. The following is how you might want to do various loops using indices (althought there is little reason to, occasionally this can be useful).

    Forward

    for(std::vector<int>::size_type i = 0; i < someVector.size(); i++) {
        /* ... */
    }
    

    Backward

    You can do this, which is perfectly defined behaivor:

    for(std::vector<int>::size_type i = someVector.size() - 1; 
        i != (std::vector<int>::size_type) -1; i--) {
        /* ... */
    }
    

    Soon, with c++1x (next C++ version) coming along nicely, you can do it like this:

    for(auto i = someVector.size() - 1; i != (decltype(i)) -1; i--) {
        /* ... */
    }
    

    Decrementing below 0 will cause i to wrap around, because it is unsigned.

    But unsigned will make bugs slurp in

    That should never be an argument to make it the wrong way (using 'int').

    Why not use std::size_t above?

    The C++ Standard defines in 23.1 p5 Container Requirements, that T::size_type , for T being some Container, that this type is some implementation defined unsigned integral type. Now, using std::size_t for i above will let bugs slurp in silently. If T::size_type is less or greater than std::size_t, then it will overflow i, or not even get up to (std::size_t)-1 if someVector.size() == 0. Likewise, the condition of the loop would have been broken completely.