c++ gamma-function

How can I optimize this S-curve function?


I am working on a gamma function that generates an "S-curve". I need to run it in a realtime environment, so I need to speed it up as much as possible.

The code is as follows:

float Gamma = 2.0f; //Input Variable

float GammaMult = pow(0.5f, 1.0f-Gamma);
if(Input<1.0f && Input>0.0f)
{
    if(Input<0.5f)
    {
        Output = pow(Input,Gamma)*GammaMult;
    }
    else
    {
        Output  = 1.0f-pow(1.0f-Input,Gamma)*GammaMult;
    }
}
else
{
   Output  = Input;
}

Is there any way I can optimize this code?


Solution

  • You can avoid pipeline stalls by eliminating the branch on Input<1.0f && Input>0.0f: clamp Input to [0, 1] with saturation arithmetic if the instruction set supports it, or with min/max intrinsics (e.g. MINSS/MAXSS on x86).

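    You should also eliminate the remaining branch on Input<0.5f by rounding the saturated Input. Rounded is 0 in the lower half and 1 in the upper half, so a single expression covers both cases. Full algorithm:

    float GammaMult = pow(0.5f, 1.0f-Gamma); // depends only on Gamma, so compute it once rather than per sample
    Input = saturate(Input); // saturate via assembly or intrinsics
    // Input is now in [0, 1]
    float Rounded = round(Input); // round via assembly or intrinsics
    float Coeff = 1.0f - 2.0f * Rounded; // +1 in the lower half, -1 in the upper half
    Output = Rounded + Coeff * pow(Rounded + Coeff * Input,Gamma)*GammaMult;

    Note that saturating changes the result for inputs outside [0, 1]: the original code passes them through unchanged, while this version clamps them to 0 or 1.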

    If you apply this function to successive values of an array, consider vectorising it if the target architecture supports SIMD.
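
The algorithm above can be sketched in portable C++ as follows. This is only an illustration, not a tuned implementation: the names SCurve and SCurveArray are made up for this example, std::min/std::max stand in for the saturate step, and std::round stands in for the rounding intrinsic. Optimizing compilers on x86 typically turn the float min/max into MINSS/MAXSS, but whether the rounding call and the array loop end up branch-free or vectorised depends on the compiler and flags, so check the generated assembly.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>

// Hypothetical helper (name is illustrative): branch-free S-curve for one sample.
// gamma_mult must be pow(0.5f, 1.0f - gamma), computed once by the caller.
inline float SCurve(float input, float gamma, float gamma_mult)
{
    // Clamp to [0, 1]. Unlike the original code, out-of-range inputs are
    // clamped to 0 or 1 instead of being passed through unchanged.
    const float x = std::min(std::max(input, 0.0f), 1.0f);

    const float rounded = std::round(x);        // 0 below 0.5, 1 from 0.5 upwards
    const float coeff = 1.0f - 2.0f * rounded;  // +1 for the lower half, -1 for the upper

    // Lower half: pow(x, gamma) * gamma_mult
    // Upper half: 1 - pow(1 - x, gamma) * gamma_mult
    return rounded + coeff * std::pow(rounded + coeff * x, gamma) * gamma_mult;
}

// Hypothetical helper: apply the curve to a whole array. The loop body has no
// data-dependent branches, which gives the compiler a chance to vectorise it.
inline void SCurveArray(const float* in, float* out, std::size_t count, float gamma)
{
    const float gamma_mult = std::pow(0.5f, 1.0f - gamma);  // hoisted out of the loop
    for (std::size_t i = 0; i < count; ++i)
        out[i] = SCurve(in[i], gamma, gamma_mult);
}
```

With Gamma = 2.0f, GammaMult is 2, so SCurve maps 0.25 to 0.125 and 0.75 to 0.875, matching the two branches of the original code.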