Fast way to get a close power-of-2 number (floating-point)

In numerical computation, it is often needed to scale numbers to be in safe range.

For example, computing Euclidean distance: sqrt(a^2+b^2). Here, if magnitude of a or b is too small/large, then underflow/overflow can happen.

A common approach to solve this is to divide numbers by the largest magnitude number. However, this solution is:

slow (division is slow)
causes a little extra inaccuracy

So I thought that instead of dividing by the largest magnitude number, let's multiply it with a close power-of-2 reciprocal number. This seems a better solution, as:

multiplication is much faster than division
better accuracy, as multiplying with a power-of-2 number is exact

So, I'd like to create a small utility function, which has a logic like this (by ^, I mean exponentiation):

void getScaler(double value, double &scaler, double &scalerReciprocal) {
    int e = <exponent of value>;
    if (e<-1022) { scaler=2^-1022; scalerReciprocal = 2^1022; }
    } else if (e>1022) { scaler=2^1022; scalerReciprocal = 2^-1022; }
    } else { scaler=2^e; scalerReciprocal = 2^(2046-e); }
}

This function should return a normalized scaler & scalerReciprocal, both are power-of-2 numbers, where scaler is near to value, and scalerReciprocal is the reciprocal of scaler.

The maximum allowed exponents for scaler/scaleReciprocal are -1022..1022 (I don't want to work with subnormal scaler, as subnormal numbers can be slow).

What would be a fast way to do this? Can this be done with pure floating-point operations? Or should I extract the exponent from value, and use simple ifs to do the logic? Is there some kind of trick to do the comparison with (-)1022 fast (as the range is symmetric)?

Note: scaler doesn't need to be the closest possible power-of-2. If some logic needs it, scaler can be some small power-of-2 away from the closest value.

Solution

Based on wim's answer, here's another solution, which can be faster, as it has one less instruction. The output is a little bit different, but still fulfills the requirements.

The idea is to use bit operations to fix border cases: put a 01 to the lsb of the exponent, no matter of its value. So, exponent:

0 becomes 1 (-1023 becomes -1022)
2046 becomes 2045 (1023 becomes 1022)
other exponents modified as well, but just slightly: the number can become two times larger compared to wim's solution (when exponent lsb changes from 00 to 01), or halved (when 10->01) or 1/4 (when 11->01)

So, this modified routine works (and I think that it's pretty cool that the problem can be solved with only 2 fast asm instructions):

#include<stdio.h>
#include<stdint.h>
#include<immintrin.h>
/* gcc -Wall -m64 -O3 -march=sandybridge dbl_scale.c */

union dbl_int64{
    double d;
    uint64_t i;
};

double get_scale(double t){
    union dbl_int64 x;
    uint64_t and_i;
    uint64_t or_i;
         /* 0xFEDCBA9876543210 */
    and_i = 0x7FD0000000000000ull;
    or_i =  0x0010000000000000ull;
    x.d = t;
    x.i = (x.i & and_i)|or_i;                     /* Set fraction bits to zero, take absolute value */
    return x.d;
}

double get_scale_x86(double t){
    __m128d x = _mm_set_sd(t);
    __m128d x_and = _mm_castsi128_pd(_mm_set1_epi64x(0x7FD0000000000000ull));
    __m128d x_or  = _mm_castsi128_pd(_mm_set1_epi64x(0x0010000000000000ull));
            x     = _mm_and_pd(x, x_and);
            x     = _mm_or_pd(x, x_or);
    return _mm_cvtsd_f64(x);
}

/* Compute the inverse 1/t of a double t with all zero fraction bits     */
/* and exponent between the limits of function get_scale                 */
/* A single integer subtraction is much less expensive than a            */
/* floating point division.                                               */
double inv_of_scale(double t){
    union dbl_int64 x;
                     /* 0xFEDCBA9876543210 */
    uint64_t inv_mask = 0x7FE0000000000000ull;
    x.d = t;
    x.i = inv_mask - x.i;
    return x.d;
}

double inv_of_scale_x86(double t){
    __m128i inv_mask = _mm_set1_epi64x(0x7FE0000000000000ull);
    __m128d x        = _mm_set_sd(t);
    __m128i x_i      = _mm_sub_epi64(inv_mask, _mm_castpd_si128(x));
    return _mm_cvtsd_f64(_mm_castsi128_pd(x_i));
}


int main(){
    int n = 14;
    int i;
    /* Several example values, 4.94e-324 is the smallest subnormal */
    double y[14] = { 4.94e-324, 1.1e-320,  1.1e-300,  1.1e-5,  0.7,  1.7,  123.1, 1.1e300,  
                     1.79e308, -1.1e-320,    -0.7, -1.7, -123.1,  -1.1e307};
    double z, s, u;

    printf("Portable code:\n");
    printf("             x       pow_of_2        inverse       pow2*inv      x*inverse \n");
    for (i = 0; i < n; i++){  
        z = y[i];
        s = get_scale(z);
        u = inv_of_scale(s);
        printf("%14e %14e %14e %14e %14e\n", z, s, u, s*u, z*u);
    }

    printf("\nx86 specific SSE code:\n");
    printf("             x       pow_of_2        inverse       pow2*inv      x*inverse \n");
    for (i = 0; i < n; i++){  
        z = y[i];
        s = get_scale_x86(z);
        u = inv_of_scale_x86(s);
        printf("%14e %14e %14e %14e %14e\n", z, s, u, s*u, z*u);
    }

    return 0;
}