Referring to this guide:
https://google.github.io/styleguide/cppguide.html#Integer_Types
Google suggests using int most of the time.
I try to follow this guide and the only problem is with STL containers.
Example 1.
void setElement(int index, int value)
{
    if (index > someExternalVector.size()) return;
    ...
}
Comparing index and .size() generates a signed/unsigned comparison warning.
Example 2.
for (int i = 0; i < someExternalVector.size(); ++i)
{
    ...
}
The same warning appears between i and .size().
If I declare index or i as unsigned int, the warning goes away, but the type declaration propagates: I then have to declare more variables as unsigned int, which contradicts the guide and loses consistency.
The best approach I can think of is to use a cast, like:
if (index > static_cast<int>(someExternalVector.size()))
or
for (int i = 0; i < static_cast<int>(someExternalVector.size()); ++i)
But I really don't like the casts.
Any suggestion?
Some detailed thoughts below:
The advantage of using only signed integers is that I can avoid signed/unsigned warnings and casts, and be sure every value can be negative (to be consistent), so -1 can be used to represent invalid values.
There are many cases where loop counters are mixed with other constants or struct members, so it is a problem if signedness is not consistent. The code ends up full of warnings and casts.
Unsigned types have three characteristics, one of which is qualitatively 'good' and one of which is qualitatively 'bad':
The size_t version (that is, 32-bit on a 32-bit machine, 64-bit on a 64-bit machine, etc.) is useful for representing memory (addresses, sizes, etc.) (neutral)
The STL uses unsigned types because of the first two points above: in order not to limit the potential size of array-like classes such as vector and deque (although you have to question how often you would want 4294967296 elements in a data structure); because a negative value will never be a valid index into most data structures; and because size_t is the correct type to use for representing anything to do with memory, such as the size of a struct, and related things such as the length of a string (see below). That's not necessarily a good reason to use it for indexes or other non-memory purposes such as a loop variable. The reason it's best practice to do so in C++ is kind of a reverse construction: it's what's used in the containers as well as other methods, and once used, the rest of the code has to match to avoid the same problem you are encountering.
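As a rough illustration of the mismatch described above, here is a minimal sketch. The helper names isValidIndex and minusOneWrapsToMax are hypothetical and not from the question; the sketch only shows how an int index can be checked against an unsigned size(), and why -1 cannot survive a conversion to an unsigned type.

```cpp
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper (not from the original post): checks an int index
// against a container's unsigned size without a mixed-sign comparison.
bool isValidIndex(int index, const std::vector<int>& v)
{
    // Reject negatives first, so the conversion below cannot wrap around.
    if (index < 0) return false;
    return static_cast<std::vector<int>::size_type>(index) < v.size();
}

// Converting -1 to an unsigned type is well defined: it wraps to the
// largest value that type can represent.
bool minusOneWrapsToMax()
{
    const std::size_t wrapped = static_cast<std::size_t>(-1);
    return wrapped == std::numeric_limits<std::size_t>::max();
}
```

The wrap-around is why a naive comparison of a negative int against size() tends to give a surprising result: the int is converted to a very large unsigned value before the comparison happens.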
You should use a signed type when the value can become negative.
You should use an unsigned type when the value cannot become negative (possibly different to 'should not'.)
You should use size_t when handling memory sizes (the result of sizeof, often things like string lengths, etc.). It is often chosen as a default unsigned type to use, because it matches the platform the code is compiled for. For example, the length of a string is size_t because a string can only ever have 0 or more elements, and there is no reason to limit a string's length method arbitrarily shorter than what can be represented on the platform, such as a 16-bit length (0-65535) on a 32-bit platform. Note (thanks commenter Morwen): std::intptr_t and std::uintptr_t are conceptually similar, will always be the right size for your platform, and should be used for memory addresses if you want something that's not a pointer. Note 2 (thanks commenter rubenvb): a string can only hold size_t-1 elements due to the value of npos. Details below.
This means that if you use -1 to represent an invalid value, you should use signed integers. If you use a loop to iterate backwards over your data, you should consider using a signed integer if you are not certain that the loop construct is correct (and as noted in one of the other answers, they are easy to get wrong.) IMO, you should not resort to tricks to ensure the code works - if code requires tricks, that's often a danger signal. In addition, it will be harder to understand for those following you and reading your code. Both these are reasons not to follow @Jasmin Gray's answer above.
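To illustrate the backwards-loop point, here is a small sketch with a signed counter. The function name reversedCopy is made up for this example, and it assumes the vector's size fits in an int. With an unsigned counter the condition i >= 0 would always be true, so the loop would never terminate correctly.

```cpp
#include <vector>

// Hypothetical example: copies a vector in reverse order using a signed index.
std::vector<int> reversedCopy(const std::vector<int>& v)
{
    std::vector<int> out;
    out.reserve(v.size());
    // One cast at the boundary; assumes v.size() fits in an int.
    const int n = static_cast<int>(v.size());
    // With a signed i, the loop ends naturally when i reaches -1.
    for (int i = n - 1; i >= 0; --i) {
        // i is non-negative here, so converting it to size_type is safe.
        out.push_back(v[i]);
    }
    return out;
}
```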
However, using integer-based loops to iterate over the contents of a data structure is the wrong way to do it in C++, so in a sense the argument over signed vs unsigned for loops is moot. You should use an iterator instead:
std::vector<foo> bar;
for (std::vector<foo>::const_iterator it = bar.begin(); it != bar.end(); ++it) {
    // Access using *it or it->, e.g.:
    const foo & a = *it;
}
When you do this, you don't need to worry about casts, signedness, etc.
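If a C++11 compiler is available (an assumption; the answer itself uses the older explicit iterator syntax), a range-based for loop is a shorter way to write the same iteration. The function sumAll below is a hypothetical example, not part of the original question.

```cpp
#include <vector>

// Hypothetical example: sums the elements without any index variable,
// so no signed/unsigned comparison can arise.
long sumAll(const std::vector<int>& values)
{
    long total = 0;
    // Range-based for (C++11) iterates from begin() to end() internally.
    for (const int value : values) {
        total += value;
    }
    return total;
}
```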
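Iterators can be forward (as above) or reverse, for iterating backwards. The loop keeps the same shape: compare against the end marker with it != bar.end() (or it != bar.rend() when starting from rbegin() with reverse iterators), because that marker signals the end of the iteration, not the end of the underlying conceptual array, tree, or other structure.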
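For backwards iteration, a sketch along these lines should work. It uses the standard rbegin() and rend() members with const_reverse_iterator; the function name reversedWithIterators is invented for the example.

```cpp
#include <vector>

// Hypothetical example: walks a vector from last element to first using
// reverse iterators, with no integer index involved.
std::vector<int> reversedWithIterators(const std::vector<int>& v)
{
    std::vector<int> out;
    out.reserve(v.size());
    // rbegin() refers to the last element; rend() marks the end of the
    // reverse iteration. ++it moves towards the front of the vector.
    for (std::vector<int>::const_reverse_iterator it = v.rbegin(); it != v.rend(); ++it) {
        out.push_back(*it);
    }
    return out;
}
```

Note that the loop body and the ++it step are identical to the forward case; only the begin and end calls change.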
In other words, the answer to your question 'Should I use int or unsigned int when working with STL containers?' is 'Neither. Use iterators instead.'
If you don't use an integer type for loops, what's left? Your own values, which are dependent on your data, but which in your case include using -1 for an invalid value. This is simple. Use signed. Just be consistent.
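As one possible way to combine both pieces of advice, the sketch below searches with iterators and reports the result as a signed int, with -1 meaning "not found". The names kInvalidIndex and indexOf are hypothetical, and the cast assumes the position fits in an int.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical sentinel: -1 marks "no such element".
const int kInvalidIndex = -1;

// Hypothetical example: returns the position of the first match as a signed
// int, or kInvalidIndex if the value is not present.
int indexOf(const std::vector<int>& v, int value)
{
    const std::vector<int>::const_iterator it = std::find(v.begin(), v.end(), value);
    if (it == v.end()) return kInvalidIndex;
    // Iterator difference is signed; the cast assumes it fits in an int.
    return static_cast<int>(it - v.begin());
}
```

The single cast sits at the boundary between the container and your own signed values, so the rest of the code can stay consistently signed.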
I am a big believer in using natural types, such as enums, and signed integers fit into this. They match our conceptual expectation more closely. When your mind and the code are aligned, you are less likely to write buggy code and more likely to expressively write correct, clean code.