C - Casting ~0 to unsigned long

I’ve seen low-level bitwise expressions that use ~0 to generate a bit pattern of all 1s, which is then used as a mask, etc. For example, on K&R page 45:

/* getbits: get n bits from position p */
unsigned getbits(unsigned x, int p, int n)
{
    return (x >> (p+1-n)) & ~(~0 << n);
}

On my machine, (unsigned long) ~0 evaluates to 0x FF FF FF FF FF FF FF FF. This lets us easily generate all-1s masks wider than an int, which is kinda nice.

However, shouldn’t (unsigned long) ~0 instead evaluate to 0x 00 00 00 00 FF FF FF FF? Without a suffix, 0 is an integer constant of type int, so ~0 evaluates to 0x FF FF FF FF. Why doesn’t casting this to unsigned long result in a zero-padded value? What am I missing here?

Edit: On my machine, sizeof(int) and sizeof(long) are 4 and 8, respectively.

Solution

On your system ~0 is an int with the bit pattern FF FF FF FF, which represents the decimal value -1 (two's complement).

When you cast to unsigned long you convert a value from one integer type to another. The rules for that can be found in 6.3.1.3 of the C (draft) standard n1570:

1 When a value with integer type is converted to another integer type other than _Bool, if the value can be represented by the new type, it is unchanged.

2 Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.⁶⁰⁾

60) The rules describe arithmetic on the mathematical value, not the value of a given type of expression.

The first rule doesn't apply, as -1 can't be represented in unsigned long. The second rule applies, i.e. the new value is found by adding ULONG_MAX + 1 once:

new-value = original-value + (ULONG_MAX + 1)

In your case:

new-value = -1 + (ULONG_MAX + 1) = ULONG_MAX

so the new value is ULONG_MAX, which has the representation FF FF FF FF FF FF FF FF on your system.