Below I've written code to convert a binary string into its double equivalent. However,
binConversionDouble("0111111111111111111111111111111111111111111111111111111111111111")
gives:
179769313486149841153851976955417028335471708986564534469802644816832731470450734933617524554272984142853535521554773891036051644209223511432842829043130247900910146729889177843938143131935935774382721844130004287345894163215696051477671359689698349651554878795894806567601614971014045300870721438977577975808.000000
but the actual value of DBL_MAX is:
179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321326889464182768467546703537516986049910576551282076245490090389328944075868508455133942304583236903222948165808559332123348274797826204144723168738177180919299881250404026184124858368.000000
My logic seems to be correct, but I am unsure whether the above is a loss of precision in the arithmetic or a flaw in my logic. Can someone please explain why the two values are different?
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
#include <assert.h>
#include <limits.h>
#include <math.h>
#include <float.h>
double binConvertDouble(const double *mainBinArr, const char usersBinVal[]) {
    double isNegative = 1;
    if (usersBinVal[0] == '1') {isNegative = -1;}
    double exponent = 0;
    char exponentSubString[12];
    int count = 0;
    for(int i =0; i<11; i++) {
        exponentSubString[i] = usersBinVal[i+1];
        count = count + 1;
    }
    exponentSubString[11] = '\0';
    exponent = calculatingDoublesExp(mainBinArr, exponentSubString);
    double summingMantissa = 0;
    for(int i =12; i<52; i++) {
        if (usersBinVal[i] == '1') {
            summingMantissa = summingMantissa + pow(2, 11-i);
        }
    }
    double totalMantissaVal = 0;
    totalMantissaVal = 1 + summingMantissa;
    double actualExp = pow(2, (exponent-1023));
    double finalVal = isNegative * totalMantissaVal * actualExp;
    return finalVal;
}
int main() {
    printf("VALUE WE GET %f\n", binConversionDouble("0111111111111111111111111111111111111111111111111111111111111111"));
    return 0;
}
The example input string "0111111111111111111111111111111111111111111111111111111111111111" corresponds to an IEEE-754 double NaN (Not-a-Number) value. The exponent part of the string is "11111111111", corresponding to a (radix-2) exponent of 1024 plus the zero-offset bias of 1023. However, the maximum (radix-2) exponent of a finite IEEE-754 double value is 1023. The out-of-range exponent causes pow(2, exponent-1023) to return inf when exponent is 2047 (exponent includes the zero-offset of 1023 here).
The IEEE-754 DBL_MAX value is represented by the input string "0111111111101111111111111111111111111111111111111111111111111111".
The code that extracts the significand (mantissa) value from the input string is terminating too early:
double summingMantissa = 0;
for(int i =12; i<52; i++) {
    if (usersBinVal[i] == '1') {
        summingMantissa = summingMantissa + pow(2, 11-i);
    }
}
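The terminating condition should be i<64. The 52 significand bits occupy string indices 12 through 63, so with i<52 only the first 40 of them are summed.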