Tags: c, encoding, binary, double, ieee-754

What happens when an integer value with more than 52 bits of mantissa is stored in the double data type?


#include <stdio.h>

int main() {
    double a = 92233720368547758071;
    printf("value=%lf\n", a);
    int i;
    char *b = &a;
    for (i = 0; i < 8; i++) {
        printf("%d byte (%p)value: <%d>\n", i + 1, b, *b);
        b++;
    }
    return 0;
}

(Compiler details): gcc (Ubuntu 13.2.0-23ubuntu4) 13.2.0

The above code generates warnings when compiling: 92233720368547758071 is too large for any integer type, and I initialized a char pointer from the address of the double because I want to inspect all 8 of its bytes.

(Warnings):

test.c: In function ‘main’:
test.c:6:12: warning: integer constant is too large for its type
    6 | double a = 92233720368547758071;
      |            ^~~~~~~~~~~~~~~~~~~~
test.c:8:17: warning: initialization of ‘char *’ from incompatible pointer type ‘double *’ [-Wincompatible-pointer-types]
    8 | int i;char *b = &a;

Output:

value=18446744073709551616.000000
1 byte (0x7ffda3f173f8)value: <0>
2 byte (0x7ffda3f173f9)value: <0>
3 byte (0x7ffda3f173fa)value: <0>
4 byte (0x7ffda3f173fb)value: <0>
5 byte (0x7ffda3f173fc)value: <0>
6 byte (0x7ffda3f173fd)value: <0>
7 byte (0x7ffda3f173fe)value: <-16>
8 byte (0x7ffda3f173ff)value: <67>

The IEEE 754 standard is used to represent floating-point numbers in memory. The double data type uses the double-precision encoding format.

The format contains:

  1. Sign bit: 1 bit

  2. Exponent: 11 bits

  3. Mantissa: 52 bits

Together these fields represent a floating-point number in 64 bits.
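
To make the field layout concrete, here is a small sketch of my own (assuming double is IEEE 754 binary64 and that <stdint.h> provides uint64_t) that copies a double into a 64-bit integer and extracts the three fields:

#include <stdio.h>
#include <stdint.h>
#include <string.h>

int main(void) {
    double d = 18446744073709551616.0;       /* 2^64, the value seen in the output above */
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);          /* reinterpret the 8 bytes without aliasing problems */

    uint64_t sign     = bits >> 63;                 /* 1 bit  */
    uint64_t exponent = (bits >> 52) & 0x7FF;       /* 11 bits, biased by 1023 */
    uint64_t mantissa = bits & 0xFFFFFFFFFFFFFULL;  /* 52 bits */

    printf("sign=%llu exponent=%llu (unbiased %lld) mantissa=0x%013llx\n",
           (unsigned long long)sign,
           (unsigned long long)exponent,
           (long long)exponent - 1023,
           (unsigned long long)mantissa);
    return 0;
}

With binary64 doubles this should print sign=0, a biased exponent of 1087 (unbiased 64), and an all-zero mantissa, which matches the decoding worked out below.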

(92233720368547758071) conversion:

Normalised number: [+]1.[0100000000000000000000000000000000000000000000000000]*2^[66]

double precision bias: 1023

Exponent: 1023+66=1089

1) sign-bit: (0)
2) Exponent: (10001000001)
3) Mantissa|Precision: (0100000000000000000000000000000000000000000000000000)

92233720368547758071 ->

01000100 00010100 00000000 00000000  00000000 00000000 00000000 00000000

(18446744073709551616) conversion:

Normalised number:[+]1.[0000000000000000000000000000000000000000000000000000]*2^[64]

double precision bias: 1023

Exponent: 1023+64=1087

1) sign-bit: (0) 
2) Exponent: (10000111111)
3) Mantissa|Precision: (0000000000000000000000000000000000000000000000000000) 

18446744073709551616 ->

01000011 11110000 00000000 00000000 00000000 00000000 00000000 00000000
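
Both hand conversions can be double-checked with a short sketch (again assuming binary64 doubles) that prints the raw 64-bit pattern of each value, most significant bit first:

#include <stdio.h>
#include <stdint.h>
#include <string.h>

static void show_bits(double d) {
    uint64_t u;
    memcpy(&u, &d, sizeof u);
    for (int i = 63; i >= 0; i--) {
        putchar(((u >> i) & 1) ? '1' : '0');
        if (i % 8 == 0)
            putchar(' ');        /* group the output into bytes for readability */
    }
    putchar('\n');
}

int main(void) {
    show_bits(92233720368547758080.0);   /* nearest double to 92233720368547758071 */
    show_bits(18446744073709551616.0);   /* 2^64 */
    return 0;
}

The two lines it prints should match the bit patterns written out above.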

My Linux system is little-endian, so it stores the least significant byte of the double at the lowest address. Decoding the bytes printed above accordingly gives the same bit pattern as the conversion of 18446744073709551616 shown above.

(System Information):

PRETTY_NAME="Ubuntu 24.04 LTS"
NAME="Ubuntu"
VERSION_ID="24.04"
VERSION="24.04 LTS (Noble Numbat)"
VERSION_CODENAME=noble
Hostname=HP-Laptop-15s-fq5xxx

1st byte value (0) which in binary -> (00000000)
2nd byte value (0) which in binary -> (00000000)
3rd byte value (0) which in binary -> (00000000)
4th byte value (0) which in binary -> (00000000)
5th byte value (0) which in binary -> (00000000)
6th byte value (0) which in binary -> (00000000)
7th byte value (-16) which in binary -> (11110000)
8th byte value (67) which in binary -> (01000011)
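
The same decoding can also be done in code. The following sketch (an illustration of mine, assuming little-endian storage as on my machine and binary64 doubles) prints the eight bytes of the stored double from the highest address down to the lowest, which yields the IEEE 754 pattern most significant byte first:

#include <stdio.h>

int main(void) {
    double a = 18446744073709551616.0;                    /* the value actually stored */
    const unsigned char *p = (const unsigned char *)&a;   /* unsigned char avoids negative byte values like -16 */
    for (int i = 7; i >= 0; i--)
        printf("%02x ", (unsigned)p[i]);                  /* expected: 43 f0 00 00 00 00 00 00 */
    printf("\n");
    return 0;
}

Here 0x43 and 0xf0 are exactly the 67 and -16 printed above, just shown as unsigned hex.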

My question here is: how did the number 92233720368547758071 get converted so that it ends up stored as 18446744073709551616?

What happened here? Why is 18446744073709551616 stored in memory instead of 92233720368547758071?


Solution

  • The declaration double a = 92233720368547758071; initializes the variable a of type double using an integer constant (specifically, an unsuffixed decimal integer constant) as the initializer. This integer constant's value needs to be implicitly converted to double.

    The problem occurs when neither the type long long int nor some extended integer type can represent the value of the unsuffixed decimal integer constant 92233720368547758071. In that case, the integer constant has "no type"1, and that violates the constraint listed in C17 6.4.4 paragraph 2:

    Each constant shall have a type and the value of a constant shall be in the range of representable values for its type.

    Although violation of a constraint does not in itself result in undefined behavior, the fact that the standard does not define how to convert a constant with no type to any type results in undefined behavior. C17 3.4.3 paragraph 1 defines undefined behavior as "behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements".

    The undefined behavior means that the implementation is free to do what it likes. The implementation in use by OP appears to have converted 92233720368547758071 to some intermediate value in the range 18446744073709550592 to 18446744073709551615 inclusive2 of some 64-bit unsigned integer type and then converted that to the value 18446744073709551616.0 of type double (assuming the implementation's double type uses the IEEE 754 binary64 format).


    1 According to C17 6.4.4.1 paragraph 5, the type of an unsuffixed decimal constant is the first type on the following list that can represent it: int, long int, long long int. Paragraph 6 says that if the integer constant cannot be represented by any type in its list, it may have an extended integer type if there is an extended integer type that can represent it. If it cannot be represented by any type in its list or by an extended integer type, then it has no type.

    2 GCC 13.6 seems to convert it by truncating the value to 64 bits, resulting in the value 18446744073709551607. Some other plausible implementation might clamp the value to the largest 64-bit unsigned value, 18446744073709551615.


    The wildly incorrect conversion could have been avoided by using a floating-point constant as the initializer, although the stored value might not exactly match the literal value in the source code. The closest representable value of type double to the literal value 92233720368547758071.0 is exactly 92233720368547758080.0 (assuming IEEE 754 binary64 format). Both points are demonstrated in the sketch below.
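
    The sketch below is my own illustration, not part of the original program; it assumes IEEE 754 binary64 doubles and a 64-bit unsigned long long:

    #include <stdio.h>

    int main(void) {
        /* 92233720368547758071 modulo 2^64 is 18446744073709551607,
           the truncated value that footnote 2 describes. */
        unsigned long long truncated = 18446744073709551607ULL;
        double from_truncated = (double)truncated;          /* rounds to 2^64 */
        printf("%.1f\n", from_truncated);                   /* 18446744073709551616.0 */

        /* A floating-point constant avoids the problem entirely;
           it rounds to the nearest representable double. */
        double from_fp_constant = 92233720368547758071.0;
        printf("%.1f\n", from_fp_constant);                 /* 92233720368547758080.0 */
        return 0;
    }

    The first printed line reproduces the value observed in the question; the second shows the nearest representable double to the intended value.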