I'm trying to compare different ways to get the absolute value of a float/double to find out which one is the fastest, because I'll then have to apply it to huge arrays. When I use a cast and a bit mask, the decimals get lost in the process. (I must use only C.)
Here's my code:
uint64_t mask = 0x7fffffffffffffff;
double d1 = -012301923.15126;
double d2 = (double)(((uint64_t)d1) & mask);
And the output is:
d1 = -012301923.15126;
d2 = 012301923.00000;
So the decimals are lost during the conversion. Is there a fast way to get them back?
Thanks in advance.
Edit: I know about fabs(); I'd just like to try and compare different "handmade" solutions.
That's because your cast converts the floating-point value to an integer value, which means the decimals are truncated. The mask is then applied to that integer, not to the bit pattern of the double.
What you have is roughly equivalent to
uint64_t temp = (uint64_t) d1;
temp &= mask;
d2 = temp;
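To make the difference concrete, here is a minimal sketch contrasting a value conversion (what the cast does) with a reinterpretation of the stored bits (what the mask needs). The function names value_of and bits_of are made up for this illustration, and the sketch assumes double is a 64-bit IEEE 754 value. Note also that, per the C standard, converting a negative double such as d1 to uint64_t is undefined behavior, so the example only casts a positive value.

```c
#include <stdint.h>
#include <string.h>

/* Value conversion: same kind of cast as in the question.
   The fractional part is discarded, e.g. 3.75 becomes 3.
   Only call this with values in the range of uint64_t. */
uint64_t value_of(double x)
{
    return (uint64_t)x;
}

/* Bit reinterpretation: copies the 8 bytes of the double unchanged.
   Assumes sizeof(double) == sizeof(uint64_t). */
uint64_t bits_of(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);
    return bits;
}
```

With these, value_of(3.75) yields the integer 3, while bits_of(3.75) yields the raw encoding in which the top bit is the sign bit that the mask is meant to clear.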
You could solve it with type punning, using a union in between:
union
{
    uint64_t i;
    double d;
} u;
u.d = d1;
u.i &= mask;
d2 = u.d;
As noted by Bathsheba, this will in practice also work with the major C++ compilers. However, the C specification explicitly allows it, while the C++ specification says it's undefined behavior (IIRC).
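Since the goal is to compare handmade variants on large arrays, here is a sketch of two candidates wrapped as functions: the union approach from above and a memcpy-based one, which is a different technique that is well defined in both C and C++. The names abs_union, abs_memcpy, abs_array and ABS_MASK are hypothetical, and the code assumes a 64-bit IEEE 754 double with the sign in the top bit. Whether either one beats fabs() depends on the compiler and target, so it needs to be measured.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: double and uint64_t have the same size (IEEE 754 binary64). */
static_assert(sizeof(double) == sizeof(uint64_t), "double must be 64 bits wide");

/* All bits set except the sign bit. */
#define ABS_MASK UINT64_C(0x7fffffffffffffff)

/* Variant 1: type punning through a union (allowed in C). */
double abs_union(double x)
{
    union { uint64_t i; double d; } u;
    u.d = x;
    u.i &= ABS_MASK;   /* clear the sign bit, keep exponent and mantissa */
    return u.d;
}

/* Variant 2: copy the bytes out and back with memcpy. */
double abs_memcpy(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);
    bits &= ABS_MASK;
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* Apply the union variant in place to an array of n doubles. */
void abs_array(double *a, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        a[i] = abs_union(a[i]);
}
```

Calling abs_union(d1) with the d1 from the question keeps the fractional part, because only the sign bit of the stored representation is changed and no conversion to an integer value takes place.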