After Cppcheck complained about "%u" being the wrong format specifier for scanning into an int variable, I changed the format to "%d". On a second look before committing the change, though, I thought the original intention might have been to reject negative inputs. I wrote two small programs to see the difference:
#include <iostream>
#include <cstdio>  // sscanf is declared here, not in <stdlib.h>
using namespace std;
int main() {
const char* s = "-4";
int value = -1;
int res = sscanf(s, "%d", &value);
cout << "value:" << value << endl;
cout << "res:" << res << endl;
return 0;
}
see also https://ideone.com/OR3IKN
#include <iostream>
#include <cstdio>  // sscanf is declared here, not in <stdlib.h>
using namespace std;
int main() {
const char* s = "-4";
int value = -1;
int res = sscanf(s, "%u", &value);
cout << "value:" << value << endl;
cout << "res:" << res << endl;
return 0;
}
see also https://ideone.com/WPWdqi
Surprisingly, both conversion specifiers accept the sign:
value:-4
res:1
I had a look at the documentation on cppreference.com. For C (scanf, fscanf, sscanf, scanf_s, fscanf_s, sscanf_s) as well as C++ (std::scanf, std::fscanf, std::sscanf), the description of the "%u" conversion specifier is the same (emphasis mine):
matches an unsigned decimal integer.
The format of the number is the same as expected by strtoul() with the value 10 for the base argument.
Is the observed behaviour standard-compliant? Where can I find this documented?
I have read that this is simply UB. To add to the confusion, here is a version declaring value as unsigned (https://ideone.com/nNBkqN). I think the assignment of -1 still behaves as expected, but "%u" obviously still matches the sign:
#include <iostream>
#include <cstdio>  // sscanf is declared here, not in <stdlib.h>
using namespace std;
int main() {
const char* s = "-4";
unsigned value = -1;
cout << "value before:" << value << endl;
int res = sscanf(s, "%u", &value);
cout << "value after:" << value << endl;
cout << "res:" << res << endl;
return 0;
}
Result:
value before:4294967295
value after:4294967292
res:1
There are two separate issues.
1. %u expects an unsigned int* argument; passing an int* is UB.
2. Does %u match -4? Yes. The expected format is that of strtoul with base 10, and if you read the documentation it's quite clear that a leading minus sign is allowed.