The question "How to use correctly the return value from std::cin.get() and std::cin.peek()?" made me wonder if it is guaranteed that std::char_traits<char>::to_int_type(c) == static_cast<int>(c) for all valid char values c.
This comes up in a lot of places. For example, istream::peek calls streambuf::sgetc, which uses to_int_type to convert the char value into int_type. Now, does std::cin.peek() == '\n' really mean that the next character is \n?
Here's my analysis. Let's collect the pieces from [char.traits.require] and [char.traits.specializations.char]:
For every int value e, to_char_type(e) returns c, if eq_int_type(e, to_int_type(c)) for some c; some unspecified value otherwise.
For every pair of int values e and f, eq_int_type(e, f) returns eq(c, d), if e == to_int_type(c) and f == to_int_type(d) for some c and d; true, if e == eof() and f == eof(); false, if e == eof() xor f == eof(); unspecified otherwise.
eof() returns a value e such that !eq_int_type(e, to_int_type(c)) for all c.
eq(c, d) iff (unsigned char) c == (unsigned char) d.
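As a rough illustration, the quoted requirements can be checked exhaustively against whatever std::char_traits<char> the local standard library provides. This is only a sketch: the helper name traits_satisfy_quoted_requirements is hypothetical, and a passing check says nothing about the actual numeric value of to_int_type(c), which is the open question here.

```cpp
#include <cassert>
#include <climits>
#include <string>  // std::char_traits

// Hypothetical helper (not part of the original post): returns true if
// std::char_traits<char> on this implementation satisfies the quoted
// requirements for every char value.
bool traits_satisfy_quoted_requirements() {
    using T = std::char_traits<char>;
    for (int i = CHAR_MIN; i <= CHAR_MAX; ++i) {
        const char c = static_cast<char>(i);
        const T::int_type e = T::to_int_type(c);
        // to_char_type must round-trip every value produced by to_int_type.
        if (!T::eq(T::to_char_type(e), c)) return false;
        // eof() must compare unequal to every to_int_type(c).
        if (T::eq_int_type(T::eof(), e)) return false;
        // eq_int_type must agree with eq on converted values.
        for (int j = CHAR_MIN; j <= CHAR_MAX; ++j) {
            const char d = static_cast<char>(j);
            if (T::eq_int_type(e, T::to_int_type(d)) != T::eq(c, d)) return false;
        }
    }
    // Two eof values must compare equal.
    return T::eq_int_type(T::eof(), T::eof());
}
```

Any conforming library should make this function return true.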
Now, consider this hypothetical implementation (syntactically simplified):
// char: [-128, 127]
// unsigned char: [0, 255]
// int: [-2^31, 2^31-1]
#define EOF INT_MIN
char to_char_type(int e) {
    return char(e - 1);
}
int to_int_type(char c) {
    return int(c) + 1;
}
bool eq(char c, char d) {
    return c == d;
}
bool eq_int_type(int c, int d) {
    return c == d;
}
int eof() {
    return EOF;
}
Note that (property 1) the conversion from unsigned char to int is value-preserving; (property 2) the conversion from char to unsigned char is bijective.
Now let's verify the requirements:
For every int value e, if eq_int_type(e, to_int_type(c)) for some c, then e == int(c) + 1. Therefore, to_char_type(e) == char(int(c)) == c.
For every pair of int values e and f, if e == to_int_type(c) and f == to_int_type(d) for some c and d, then eq_int_type(e, f) iff int(c) + 1 == int(d) + 1 iff c == d (by property 1). The EOF cases are also trivially verifiable.
For every char value c, int(c) >= -128, so int(c) + 1 != EOF. Therefore, !eq_int_type(eof(), to_int_type(c)).
For every pair of char values c and d, eq(c, d) iff (unsigned char) c == (unsigned char) d (by property 2).
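The verification above can also be run mechanically. The following is a minimal sketch of the hypothetical traits, with a few assumptions that are not in the code as posted: the functions live in a made-up namespace hypo, the constant kEof stands in for the EOF macro, and to_char_type guards the eof value so that the sketch never evaluates INT_MIN - 1.

```cpp
#include <cassert>
#include <climits>

namespace hypo {  // hypothetical traits from the text, not a real library

static_assert(CHAR_MAX < INT_MAX, "sketch assumes int is wider than char");

constexpr int kEof = INT_MIN;  // stands in for the EOF macro in the text

int to_int_type(char c) { return int(c) + 1; }

char to_char_type(int e) {
    // Guard added for this sketch: INT_MIN - 1 would be signed overflow.
    return e == kEof ? char(0) : char(e - 1);
}

bool eq(char c, char d) { return c == d; }
bool eq_int_type(int e, int f) { return e == f; }
int eof() { return kEof; }

// Exhaustively checks the four quoted requirements over all char values.
bool requirements_hold() {
    for (int i = CHAR_MIN; i <= CHAR_MAX; ++i) {
        const char c = static_cast<char>(i);
        const int e = to_int_type(c);
        if (to_char_type(e) != c) return false;   // round trip
        if (eq_int_type(eof(), e)) return false;  // eof is distinct
        for (int j = CHAR_MIN; j <= CHAR_MAX; ++j) {
            const char d = static_cast<char>(j);
            const bool same_as_unsigned =
                static_cast<unsigned char>(c) == static_cast<unsigned char>(d);
            if (eq(c, d) != same_as_unsigned) return false;
            if (eq_int_type(e, to_int_type(d)) != eq(c, d)) return false;
        }
    }
    return eq_int_type(eof(), eof());
}

// The comparison the question is about: with these traits, the int_type
// value for '\n' is not the plain integer value of '\n'.
bool newline_compares_equal() { return to_int_type('\n') == '\n'; }

}  // namespace hypo
```

Under these assumptions, hypo::requirements_hold() returns true while hypo::newline_compares_equal() returns false, which is the situation the question describes.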
Does that mean this implementation is conforming, and therefore std::cin.peek() == '\n' does not do what it is supposed to do? Did I miss anything in my analysis?
Does that mean this implementation is conforming, and therefore std::cin.peek() == '\n' does not do what it is supposed to do?
I agree with your analysis. This isn't guaranteed.
It appears that you would have to use eq_int_type(std::cin.peek(), to_int_type('\n')) to guarantee a correct result.
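A minimal sketch of that comparison, wrapped in a helper so it can be tried on any std::istream. The name next_char_is is hypothetical and only serves the illustration:

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: true if the next character in `in` is `expected`,
// false at end of input or when a different character is next.
bool next_char_is(std::istream& in, char expected) {
    using T = std::char_traits<char>;
    // Compare in the int_type domain, as the traits requirements intend,
    // instead of comparing peek() against a char literal directly.
    return T::eq_int_type(in.peek(), T::to_int_type(expected));
}
```

For example, next_char_is(std::cin, '\n') would replace std::cin.peek() == '\n'. At end of input peek() returns eof(), which by the requirements never compares equal to to_int_type(c) for any c.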
P.S. Your to_char_type(EOF) has undefined behaviour due to signed overflow in INT_MIN - 1. Sure, the value is unspecified in this case, but you still cannot have UB. This would be valid:
char to_char_type(int e) {
    return e == EOF
        ? 0 // doesn't matter
        : char(e - 1);
}
to_int_type would have UB on systems where int and char are the same size, in case c == INT_MAX, but you've excluded those systems with the hypothetical sizes.