I am confused about CHAR_BIT in limits.h. I have read some articles saying the macro CHAR_BIT is there for portability. Using the macro instead of a magic number like 8 in code is reasonable. But limits.h comes from glibc-headers, and its value is fixed as 8. If glibc-headers is installed on a system on which a byte has more than 8 bits (say 16 bits), does compilation go wrong? Is a 'char' given 8 bits or 16 bits?
And when I modified CHAR_BIT to 9 in limits.h, the following code still printed '8'. How?
#include <stdio.h>
#include <limits.h>
int
main(int argc, char **argv)
{
    printf("%d\n", CHAR_BIT);
    return 0;
}
The following is supplementary:
I've read all replies so far, but I am still not clear. In practice, I can follow the advice to #include <limits.h> and use CHAR_BIT. But that is a separate matter. Here I want to know why it appears this way. First, there is a fixed value '8' in glibc's /usr/include/limits.h: what happens on systems where a byte is not 8 bits when glibc is installed? Then I found that '8' is not even the value the code actually uses, so does the '8' mean nothing? Why put '8' there if the value is not used at all?
Thanks,
Diving into system header files can be a daunting and unpleasant experience. glibc header files can easily create a lot of confusion in your head, because they include other system header files under certain circumstances that override what has been defined so far.
In the case of limits.h, if you read the header file carefully, you will find that the definition of CHAR_BIT:

#define CHAR_BIT 8

is only used when you compile code without gcc, because that line is inside an #if conditional a few lines above:
/* If we are not using GNU CC we have to define all the symbols ourself.
Otherwise use gcc's definitions (see below). */
#if !defined __GNUC__ || __GNUC__ < 2
Thus, if you compile your code with gcc, which is most likely the case, this definition of CHAR_BIT will not be used. That's why you changed it and your code still printed the old value. Scrolling down a little in the header file, you can find this for the case where you are using GCC:
/* Get the compiler's limits.h, which defines almost all the ISO constants.
We put this #include_next outside the double inclusion check because
it should be possible to include this file more than once and still get
the definitions from gcc's header. */
#if defined __GNUC__ && !defined _GCC_LIMITS_H_
/* `_GCC_LIMITS_H_' is what GCC's file defines. */
# include_next <limits.h>
include_next is a GCC extension. You can read about what it does in this question: Why would one use #include_next in a project?
Short answer: it searches for the next header file with the name you specify (limits.h in this case), and it ends up including GCC's own generated limits.h. On my system, that happens to be /usr/lib/gcc/i486-linux-gnu/4.7/include-fixed/limits.h.
Consider the following program:
#include <stdio.h>
#include <limits.h>
int main(void) {
    printf("%d\n", CHAR_BIT);
    return 0;
}
With this program, you can find the path for your system with the help of gcc -E, which outputs a special line for each file included (see http://gcc.gnu.org/onlinedocs/cpp/Preprocessor-Output.html).
Because #include <limits.h> is on line 2 of this program, which I named test.c, running gcc -E test.c allows me to find the real file being included:
# 2 "test.c" 2
# 1 "/usr/lib/gcc/i486-linux-gnu/4.7/include-fixed/limits.h" 1 3 4
You can find this in that file:
/* Number of bits in a `char'. */
#undef CHAR_BIT
#define CHAR_BIT __CHAR_BIT__
Note the #undef directive: it is needed to override any possible previous definition. It says: "Forget whatever CHAR_BIT was, this is the real thing". __CHAR_BIT__ is a gcc predefined macro. GCC's online documentation describes it in the following way:
__CHAR_BIT__
Defined to the number of bits used in the representation of the char data type. It exists to make the standard header given numerical limits work correctly. You should not use this macro directly; instead, include the appropriate headers.
You can read its value with a simple program:
#include <stdio.h>
#include <limits.h>
int main(void) {
    printf("%d\n", __CHAR_BIT__);
    return 0;
}
You can also watch the macro expand by running gcc -E code.c. Note that you shouldn't use __CHAR_BIT__ directly, as gcc's documentation mentions.
Obviously, if you change the CHAR_BIT definition inside /usr/lib/gcc/i486-linux-gnu/4.7/include-fixed/limits.h, or whatever the equivalent path is on your system, you will be able to see the change in your code. Consider this simple program:
#include <stdio.h>
#include <limits.h>
int main(void) {
    printf("%d\n", CHAR_BIT);
    return 0;
}
Changing the CHAR_BIT definition in gcc's limits.h (that is, the file /usr/lib/gcc/i486-linux-gnu/4.7/include-fixed/limits.h) from __CHAR_BIT__ to 9 will make this code print 9. Again, you can stop the compilation process after preprocessing takes place and check the result with gcc -E.
What if you're compiling code with a compiler other than gcc?
Well, then the default ANSI limits are assumed. From paragraph 5.2.4.2.1 of the ANSI C standard (Sizes of integral types <limits.h>):
The values given below shall be replaced by constant expressions suitable for use in #if preprocessing directives. [...] Their implementation-defined values shall be equal or greater in magnitude (absolute value) to those shown, with the same sign.
number of bits for smallest object that is not a bit-field (byte)
CHAR_BIT 8
POSIX mandates that a compliant platform have CHAR_BIT == 8.
Of course, glibc's assumptions can go wrong on a machine that does not have CHAR_BIT == 8, but note that you would have to be on an unusual architecture AND not be using gcc AND be on a platform that is not POSIX compliant. Not very likely.
Remember, however, that "implementation-defined" means that the compiler writer chooses what happens. Thus, even if you're not compiling with gcc, there is a chance that your compiler has some sort of __CHAR_BIT__ equivalent defined. Even though glibc will not use it, you can do a little research and use your compiler's definition directly. This is generally bad practice, though: you will be writing code that is geared towards a specific compiler.
Keep in mind that you should never mess with system header files. Very weird things can happen when you compile code with wrong values for important constants like CHAR_BIT. Do this for educational purposes only, and always restore the original file afterwards.