I'm doing some C puzzle questions. In most cases I can find the right answer, but I am having problems with this one. I know the right answer from running the code through a compiler, but I don't know the reason for it.
Have a look at the code:
char c[] = "abc\012\0x34";
What would strlen(c) return, using a Standard C compiler?
My compiler returns 4 when what I expected was 3.
What I thought is that strlen() would search for the first occurrence of the null character ('\0'), but somehow the result is one more than I expected.
Any idea why?
Let's write
char c[] = "abc\012\0x34";
with single characters:
char c[] = { 'a', 'b', 'c', '\012', '\0', 'x', '3', '4', '\0' };
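The first \0 you see is not a null character: it is the start of the octal escape sequence \012, which extends over the following octal digits. That escape is a single character with value 10 (a newline in ASCII), so strlen() counts 'a', 'b', 'c', '\012' and stops at the '\0' that follows, giving 4. The x34 after that '\0' is just the three ordinary characters 'x', '3', '4', not a hexadecimal escape.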
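The breakdown above can be checked with a few assertions. This is only a minimal sketch: check_layout is a hypothetical helper name chosen for illustration, and the asserts are active only when NDEBUG is not defined.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The string from the question. */
static const char c[] = "abc\012\0x34";

/* check_layout is a hypothetical helper name, not part of any library. */
static void check_layout(void)
{
    /* Eight characters from the literal plus the implicit terminator. */
    assert(sizeof c == 9);

    assert(c[3] == '\012');  /* one character written as an octal escape */
    assert(c[3] == 10);      /* octal 12 is decimal 10 */
    assert(c[4] == '\0');    /* the first null character */
    assert(c[5] == 'x');     /* an ordinary 'x', not the start of a hex escape */
    assert(c[8] == '\0');    /* the terminator appended by the compiler */

    /* strlen counts the characters before the first null character. */
    assert(strlen(c) == 4);
}
```

Calling check_layout() from main should pass silently on a conforming compiler; sizeof sees all nine elements, while strlen stops at index 4.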
Octal escape sequences are specified in section 6.4.4.4 of the standard (N1570 draft):
octal-escape-sequence:
    \ octal-digit
    \ octal-digit octal-digit
    \ octal-digit octal-digit octal-digit
They consist of a backslash followed by one, two, or three octal digits. Paragraph 7 of that section gives the extent of octal and hexadecimal escape sequences:
7 Each octal or hexadecimal escape sequence is the longest sequence of characters that can constitute the escape sequence.
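As a rough illustration of the "longest sequence" rule for octal escapes, the following sketch compares a few string literals. The function name check_longest_match is hypothetical and used only for this example.

```c
#include <assert.h>

/* check_longest_match is a hypothetical helper name used for illustration. */
static void check_longest_match(void)
{
    /* One, two, or three octal digits each form a single character. */
    assert(sizeof "\1" == 2);
    assert(sizeof "\12" == 2);
    assert(sizeof "\012" == 2);

    /* A leading zero does not change the value. */
    assert("\12"[0] == "\012"[0]);

    /* '8' is not an octal digit, so the escape ends after "\12". */
    assert(sizeof "\128" == 3);
    assert("\128"[1] == '8');
}
```

Each sizeof result includes the terminating null character, so a literal holding a single escape has size 2.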
Note that while the length of an octal escape sequence is limited to at most three octal digits (thus "\123456" consists of five characters, { '\123', '4', '5', '6', '\0' }), hexadecimal escape sequences have unlimited length:
hexadecimal-escape-sequence:
    \x hexadecimal-digit
    hexadecimal-escape-sequence hexadecimal-digit
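and thus "\x123456789abcdef" consists of only two characters ({ '\x123456789abcdef', '\0' }). In practice a value that large is outside the range of char, which violates a constraint in the same section, so the compiler must issue a diagnostic.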
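The difference between the two kinds of escape can be sketched as follows. The helper name check_escape_lengths is hypothetical. The second half relies on adjacent string literals being concatenated only after escape sequences have been converted, which is the usual way to end a hexadecimal escape early.

```c
#include <assert.h>

/* check_escape_lengths is a hypothetical helper name used for illustration. */
static void check_escape_lengths(void)
{
    /* Octal: at most three digits are consumed, so '4', '5', '6' stay separate. */
    assert(sizeof "\123456" == 5);
    assert("\123456"[1] == '4');

    /* Hex: 'B' and 'C' are hex digits, so "\x41BC" would be one out-of-range
       escape. Splitting the literal ends the escape after "41". */
    assert(sizeof "\x41" "BC" == 4);
    assert(("\x41" "BC")[0] == 0x41);
    assert(("\x41" "BC")[1] == 'B');
    assert(("\x41" "BC")[2] == 'C');
}
```

The split-literal form is worth remembering whenever a hexadecimal escape is followed by text that happens to begin with one of the characters 0-9, a-f, or A-F.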