I want to understand the following code:
//...
#define _C 0x20
extern const char *_ctype_;
//...
__only_inline int iscntrl(int _c)
{
    return (_c == -1 ? 0 : ((_ctype_ + 1)[(unsigned char)_c] & _C));
}
It originates from the file ctype.h in the OpenBSD operating system source code. This function checks whether a char is a control character or a printable letter inside the ASCII range. This is my current chain of thought:
Somehow, strangely, it works, and every time 0 is returned the given char _c is not a control character. Otherwise, when it is a control character, the function just returns a nonzero integer value that's not of any special interest. My problem of understanding is in steps 3, 4 (a bit) and 5.
Thank you for any help.
_ctype_ appears to be a restricted internal version of the symbol table, and I'm guessing the + 1 is there because they didn't bother saving index 0 of it, since that one isn't printable. Or possibly they are using a 1-indexed table instead of the 0-indexed one that is customary in C.
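Whatever the leading slot actually holds, the effect of the + 1 can be sketched on a tiny made-up table. The names demo_table, demo_ptr, lookup_shifted and lookup_plain are hypothetical and only illustrate that (p + 1)[i] reads the same element as p[i + 1]; this is not the real OpenBSD table.

```c
/* Hypothetical 5-entry table; slot 0 is an extra leading entry. */
static const char demo_table[5] = { 'X', 'a', 'b', 'c', 'd' };
static const char *demo_ptr = demo_table;

/* Same expression shape as (_ctype_ + 1)[...] in the header. */
static char lookup_shifted(unsigned char i)
{
    return (demo_ptr + 1)[i];
}

/* Equivalent spelling with the offset moved into the index. */
static char lookup_plain(unsigned char i)
{
    return demo_ptr[i + 1];
}
```

With this layout, index 0 of the shifted view lands on the second stored element, so the leading slot is simply skipped for every unsigned char index.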
The C standard dictates this for all ctype.h functions:
"In all cases the argument is an int, the value of which shall be representable as an unsigned char or shall equal the value of the macro EOF."
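For callers, that requirement usually means casting a plain char to unsigned char before passing it in. A minimal sketch, assuming a hypothetical helper named count_cntrl that is not part of any library:

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: counts control characters in a NUL-terminated string. */
static size_t count_cntrl(const char *s)
{
    size_t n = 0;

    for (; *s != '\0'; s++) {
        /* Plain char may be signed; cast so the argument is always
         * representable as unsigned char, as the standard requires. */
        if (iscntrl((unsigned char)*s))
            n++;
    }
    return n;
}
```

Calling count_cntrl("a\tb\n") should count the tab and the newline in the default "C" locale.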
Going through the code step by step:
int iscntrl(int _c): The int types are really characters, but all ctype.h functions are required to handle EOF, so they must be int.
_c == -1: This is a check against EOF, which has the value -1 on this implementation (the standard only requires EOF to be a negative int).
_ctype_ + 1: This is pointer arithmetic to get the address of an array item.
[(unsigned char)_c]: This is simply an array access into that array, where the cast is there to enforce the standard requirement of the parameter being representable as unsigned char. Note that char can actually hold a negative value, so this is defensive programming. The result of the [] array access is a single character from their internal symbol table.
& _C: The masking is there to get a certain group of characters from the symbol table. Apparently all characters whose table entry has bit 5 set (mask 0x20) are control characters. There's no making sense of this without viewing the table.
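Since the real table isn't shown here, the steps above can be tried against a stand-in. This is only a sketch under stated assumptions: demo_ctype, demo_ctype_ptr, demo_init, demo_iscntrl and DEMO_C are made-up names, and the table contents are an assumption that marks ASCII 0x00-0x1F and 0x7F as control characters; the actual OpenBSD table may be laid out differently.

```c
#define DEMO_C 0x20  /* same bit value as _C in the excerpt */

/* Hypothetical stand-in table: 1 leading slot + 256 entries, all zero at start. */
static char demo_ctype[1 + 256];
static const char *demo_ctype_ptr = demo_ctype;

/* Assumption: ASCII 0x00-0x1F and 0x7F are the control characters. */
static void demo_init(void)
{
    int c;

    for (c = 0; c < 0x20; c++)
        demo_ctype[1 + c] = DEMO_C;
    demo_ctype[1 + 0x7F] = DEMO_C;
}

/* Same expression shape as iscntrl() in the excerpt. */
static int demo_iscntrl(int c)
{
    return (c == -1 ? 0 : ((demo_ctype_ptr + 1)[(unsigned char)c] & DEMO_C));
}
```

After calling demo_init(), demo_iscntrl('\n') yields the mask value 0x20 (nonzero, so "true"), while demo_iscntrl('A') and demo_iscntrl(-1) yield 0.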