Search code examples
cc99strict-aliasing

Correct way to write an in-place floating point type promotion loop?


I have a routine that promotes an array of single precision data to double precision in place using that the array is appropriately sized to handle the extra bytes:

void dpromote(const int n, double *x)             
{
    for (int i = n; i --> 0 ;) {    
        x[i] = ((float *)x)[i];
    }
}

On entry, x should contain n floats and on exit it will contain n doubles:

void test_dpromote()
{
    double  e[]   = {1, 2, 3, 4, 5, 6, 7};
    float   x[]   = {1, 2, 3, 4, 5, 6, 7, 0, 0, 0, 0, 0, 0, 0};
    const int n = sizeof(e)/sizeof(e[0]);
    dpromote(n, (void *) x);
    /* memcmp(x, e, sizeof(e)) will return 0 when this works as expected */
}

Why am I doing this? Mixed precision iterative refinement within a numerics-heavy code. For the purposes of this question, you can ignore why as its genuinely irrelevant.

Multiple compilers are okay with the dpromote logic at a variety of aggressive optimization levels with strict aliasing enabled. Recently, one compiler vendor (who shall remain nameless) decided to re-index my loop so that it was a forward traversal through memory instead of a backwards traversal. If you stare at the code for half a minute, you'll see that loop transformation produces utter garbage.

Does the dpromote logic, with all the C99 strict-aliasing bells and whistles enabled, rely on undefined behavior? I can't figure out why a compiler would think it okay to change the loop indexing unless the code is doing undefined things.


Solution

  • Yes, you are breaking the strict aliasing rules. Use a union - it may not maintain your original layout, however it reflects your intent much better and is generally cleaner:

    #include <stdio.h>
    
    union value 
    {
        double d;
        float f;
    };
    
    void dpromote (const int n, value* x)
    {
        for (int i=0; i < n; ++i)
            x[i].d = x[i].f;
    }
    
    void test_dpromote()
    {
        value x[] = {{.f=1}, {.f=2}, {.f=3}, {.f=4}, {.f=5}, {.f=6}, {.f=7}};
        const int n = sizeof(x) / sizeof(x[0]);
    
        for (int i=0; i < n; ++i)
            printf("float: %f\n", x[i].f);
    
        dpromote(n, x);
    
        for (int i=0; i < n; ++i)
            printf("double: %f\n", x[i].d);
    }
    
    int main ()
    {
        test_dpromote();
        return 0;
    }
    

    If you must maintain the original layout then you'll need to manage the block of memory manually to satisfy strict aliasing rules:

    #include <stdio.h>
    #include <string.h>
    #include <stdlib.h>
    
    double* dpromote (const int n, char* x)
    {
        for (int i=n; i-- > 0; )
        {
            float f;
            memcpy(&f, &x[i*sizeof(float)], sizeof(f));
            double d = f;
            memcpy(&x[i*sizeof(double)], &d, sizeof(d));
        }
    
        return (double*)x;
    }
    
    void test_dpromote()
    {
        int const n = 7;
        char* block = (char*)malloc(n*sizeof(double));
        for (int i=0; i < n; ++i)
        {
            float const x = i+1;
            memcpy(&block[i*sizeof(float)], &x, sizeof(x));
        }
    
        // It is now safe to access block through x
    
        float* x = (float*)block;
        for (int i=0; i < n; ++i)
            printf("float: %f\n", x[i]);
    
        double* y = dpromote(n, block);
        for (int i=0; i < n; ++i)
            printf("double: %f\n", y[i]);
    
        // It is now safe to access block through y, however
        // subsequent access through x will violate strict aliasing rules
    }
    
    int main ()
    {
        test_dpromote();
        return 0;
    }