
sizeof operator fails on bitwise & in C


Thanks for your response. The %zu specifier works fine with the sizeof operator: the sizes print correctly, and so do the x and y values. But when I change all of those specifiers to %lld, which should print the actual x and y values (since they are long long), it prints random garbage values.

    #include<stdio.h>

    int main()
    {
      long long x=500000000007;
      long long y=1;
      printf("size of x : %lld,x = %lld, size of y : %lld, y : %lld\n",sizeof(x),x,sizeof(y),y); 

      if(x & y)
      {
        printf("ODD IT IS :), x&1 = %lld, size of x&1 = %lld\n",x&y,sizeof(x&y) );//<--- this line
      }



      printf("After If statement ---> size of x : %lld,x = %lld\n",sizeof(x),x); 

      return 0;
    }

    Output on a 32-bit Linux system (Ubuntu), compiled with gcc:
      size of x : 7661335479756783624,x = 34359738484, size of y : 1, y : -4160453359
      ODD IT IS :), x&1 = 1, size of x&1 = 34359738376
      After If statement ---> size of x : 7661335479756783624,x = 34359738484

Now, the question is: why are the values of x and y affected by the choice of format specifier for the result of sizeof? Thanks in advance.


Solution

  • Theory

    The problem is that you're not using the correct format specification to print sizeof().

    printf("size of x : %d,x = %lld, size of y : %d, y : %lld\n", sizeof(x), x, sizeof(y), y); 
    

    You're using %d, which expects an int, to print a size_t which is (a) an unsigned type and (b) most probably 8 bytes, not 4 bytes like an int would be. The correct way to print size_t (like the result of sizeof) is with the z modifier:

    printf("size of x: %zu, x = %lld, size of y: %zu, y = %lld\n", sizeof(x), x, sizeof(y), y);
    

    If you don't have support for %zu in the library, then cast the result of sizeof(x) etc to int explicitly. However, if you have support for long long, it is unlikely that the library does not support %zu.
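
    For example, a minimal sketch of that cast fallback (using the same x as above; the casts are safe here because sizeof a scalar always fits in an int):

    printf("size of x : %d, x = %lld\n", (int)sizeof(x), x);
    /* or, a little more defensively, cast to unsigned long: */
    printf("size of x : %lu, x = %lld\n", (unsigned long)sizeof(x), x);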

    If you use GCC, it should have been warning you about type mismatches in the argument list to printf().

    When you use the wrong type, the wrong values get picked off the stack inside printf(). You push 4 units of 8 bytes on the stack, but tell printf() to read 4 bytes, 8 bytes, 4 bytes and 8 bytes.
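
    A rough sketch of that reinterpretation (illustration only: printf() does not literally work this way, and the split assumes a little-endian machine):

    #include <stdio.h>
    #include <string.h>
    
    int main(void)
    {
        long long x = 500000000007LL;   /* an 8-byte value */
        unsigned int low, high;
    
        /* Reread the same 8 bytes as two 4-byte chunks, roughly how a
         * mismatched %d/%lld pairing mis-slices the argument bytes on a
         * little-endian 32-bit ABI. */
        memcpy(&low,  (char *)&x,     sizeof low);
        memcpy(&high, (char *)&x + 4, sizeof high);
    
        printf("read as long long      : %lld\n", x);
        printf("read as two 4-byte ints: %u and %u\n", low, high);
        return 0;
    }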


    Compiling and fixing the code

    Taking your original code, sz1.c, with minor modifications:

    #include <stdio.h>
    
    int main(void)
    {
        long long x=500000000007;
        long long y=1;
    
        printf("size of x : %d, x = %lld, size of y : %d, y : %lld\n", sizeof(x), x, sizeof(y), y); 
        printf("ODD IT IS :), x&1 = %d, size of x&1 = %lld\n", x&y, sizeof(x&y));
        printf("After If statement ---> size of x : %d, x = %lld\n", sizeof(x), x); 
    
        return 0;
    }
    

    Then compiling it gives me lots of warnings (this is GCC 4.8.1, and for some reason, it produces two copies of each warning for format warnings — it is bad, but not quite that bad!):

    $ gcc -g -std=c99 -Wall -Wextra -Wmissing-prototypes -Wstrict-prototypes -Wold-style-definition     sz1.c -o sz1
    sz1.c: In function ‘main’:
    sz1.c:8:5: warning: format ‘%d’ expects argument of type ‘int’, but argument 2 has type ‘long unsigned int’ [-Wformat=]
         printf("size of x : %d, x = %lld, size of y : %d, y : %lld\n", sizeof(x), x, sizeof(y), y); 
         ^
    sz1.c:8:5: warning: format ‘%d’ expects argument of type ‘int’, but argument 4 has type ‘long unsigned int’ [-Wformat=]
    sz1.c:8:5: warning: format ‘%d’ expects argument of type ‘int’, but argument 2 has type ‘long unsigned int’ [-Wformat=]
    sz1.c:8:5: warning: format ‘%d’ expects argument of type ‘int’, but argument 4 has type ‘long unsigned int’ [-Wformat=]
    sz1.c:9:5: warning: format ‘%d’ expects argument of type ‘int’, but argument 2 has type ‘long long int’ [-Wformat=]
         printf("ODD IT IS :), x&1 = %d, size of x&1 = %lld\n", x&y, sizeof(x&y));
         ^
    sz1.c:9:5: warning: format ‘%lld’ expects argument of type ‘long long int’, but argument 3 has type ‘long unsigned int’ [-Wformat=]
    sz1.c:9:5: warning: format ‘%d’ expects argument of type ‘int’, but argument 2 has type ‘long long int’ [-Wformat=]
    sz1.c:9:5: warning: format ‘%lld’ expects argument of type ‘long long int’, but argument 3 has type ‘long unsigned int’ [-Wformat=]
    sz1.c:10:5: warning: format ‘%d’ expects argument of type ‘int’, but argument 2 has type ‘long unsigned int’ [-Wformat=]
         printf("After If statement ---> size of x : %d, x = %lld\n", sizeof(x), x); 
         ^
    sz1.c:10:5: warning: format ‘%d’ expects argument of type ‘int’, but argument 2 has type ‘long unsigned int’ [-Wformat=]
    $
    

    This version, sz2.c, has the correct type specifiers (so it compiles without any warnings):

    #include <stdio.h>
    
    int main(void)
    {
        long long x=500000000007;
        long long y=1;
        printf("size of x : %zu, x = %lld, size of y : %zu, y : %lld\n", sizeof(x), x, sizeof(y), y); 
        printf("ODD IT IS :), x&1 = %lld, size of x&1 = %zu\n", x&y, sizeof(x&y) );//<--- this line
        printf("After If statement ---> size of x : %zu, x = %lld\n", sizeof(x), x); 
        return 0;
    }
    

    Interestingly, on Mac OS X 10.9.4 with GCC 4.8.1, the output of the two programs is basically the same:

    $ ./sz1 
    size of x : 8, x = 500000000007, size of y : 8, y : 1
    ODD IT IS :), x&1 = 1, size of x&1 = 8
    After If statement ---> size of x : 8, x = 500000000007
    $ ./sz2
    size of x : 8, x = 500000000007, size of y : 8, y : 1
    ODD IT IS :), x&1 = 1, size of x&1 = 8
    After If statement ---> size of x : 8, x = 500000000007
    $
    

    However, compile the same code as a 32-bit executable instead of a 64-bit one, and there are differences:

    $  ./sz1 
    size of x : 8, x = 500000000007, size of y : 8, y : 1
    ODD IT IS :), x&1 = 1, size of x&1 = 34359738368
    After If statement ---> size of x : 8, x = 500000000007
    $ ./sz2
    size of x : 8, x = 500000000007, size of y : 8, y : 1
    ODD IT IS :), x&1 = 1, size of x&1 = 8
    After If statement ---> size of x : 8, x = 500000000007
    $
    

    Why?

    Why does it give a size as large as 34359738368 when it is printed with %lld?

    Because of the way the data is stacked. Assuming that you have a 32-bit compilation, the 'ODD' call to printf() pushes:

    1. The pointer to the format.
    2. 8 bytes for the value of x&y (because both x and y are long long, so x&y is also a long long).
    3. 4 bytes for the value of sizeof(x&y) (assuming a 32-bit compilation, with sizeof(size_t) == 4).

    The incorrect format tells printf() that:

    1. It should use 4 bytes off the stack to print the first value (%d) — x&y — instead of the correct 8 bytes.
    2. It should use 8 bytes off the stack to print the second value (%lld) — sizeof(x&y) — instead of the correct 4 bytes.

    Because the machine is little-endian (presumably an Intel machine), the first 4 bytes read are the same whether the value is treated as 4 bytes or 8 bytes wide. The misinterpretation shows up in the next conversion: the 4 bytes of zeros left over from the end of the x&y value are read as the low half of the value that %lld prints. Change the format from %lld to 0x%llX (in the original code) and from %zu to 0x%zX, and the ODD lines change:

    ODD IT IS :), x&1 = 1, size of x&1 = 0x800000000
    
    ODD IT IS :), x&1 = 1, size of x&1 = 0x8
    

    There are 8 zeros after the 0x8.
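
    For reference, a sketch of those two modified calls (the surrounding code is sz1.c and sz2.c above; the sz1.c line deliberately keeps its size-specifier mismatch):

    /* sz1.c with the hex change (wrong-size specifier, 32-bit build): */
    printf("ODD IT IS :), x&1 = %d, size of x&1 = 0x%llX\n", x&y, sizeof(x&y));
    /* sz2.c with the hex change (correct %zX for a size_t): */
    printf("ODD IT IS :), x&1 = %lld, size of x&1 = 0x%zX\n", x&y, sizeof(x&y));

    Note that 0x800000000 is 8 * 2^32 = 34359738368: the size value 8 ends up shifted up by 32 bits because the 4 leftover zero bytes from x&y are read as the low half of the 8-byte value.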