Search code examples
calgorithmmathrecursion

Power by squaring for negative exponents


I am not sure if power by squaring takes care of negative exponent. I implemented the following code which works for only positive numbers.

    #include <stdio.h>
    int powe(int x, int exp)
    {
         if (x == 0)
            return 1;
         if (x == 1)
            return x;
         if (x&1)
                return powe(x*x, exp/2);
         else
                return x*powe(x*x, (exp-1)/2);       
    }

Looking at https://en.wikipedia.org/wiki/Exponentiation_by_squaring doesn't help as the following code seems wrong.

    Function exp-by-squaring(x, n ) 
      if n < 0  then return exp-by-squaring(1 / x, - n );
      else if n = 0  then return  1;
      else if n = 1  then return  x ; 
      else if n is even  then return exp-by-squaring(x * x,  n / 2);
      else if n is odd  then return x * exp-by-squaring(x * x, (n - 1) / 2).

Edit: Thanks to amit this solution works for both negative and positive numbers:

    float powe(float x, int exp)
    {
            if (exp < 0)
                    return powe(1/x, -exp);
            if (exp == 0)
                    return 1;
            if (exp == 1)
                    return x;
            if (((int)exp)%2==0)
                    return powe(x*x, exp/2);
            else
                    return x*powe(x*x, (exp-1)/2);
    }

For fractional exponent we can do below (Spektre method):

  1. Suppose you have x^0.5 then you easily calculate square root by this method: start from 0 to x/2 and keep checking x^2 is equal to the result or not in binary search method.

  2. So, in case you have x^(1/3) you have to replace if mid*mid <= n to if mid*mid*mid <= n and you will get the cube root of x.Same thing applies for x^(1/4), x^(1/5) and so on. In the case of x^(2/5) we can do (x^(1/5))^2 and again reduce the problem of finding the 5th root of x.

  3. However by this time you would have realized that this method works only in the case when you can convert the root to 1/x format. So are we stuck if we can't convert? No, we can still go ahead as we have the will.

  4. Convert your floating point to fixed point and then calculate pow(a, b). Suppose the number is 0.6, converting this to (24, 8)floating point yields Floor(0.6*1<<8) = 153(10011001). As you know 153 represents the fractional part so in fixed point this (10011001) represent (2^-1, 0, 0, 2^-3, 2^-4, 0, 0, 2^7).So we can again calculate the pow(a, 0.6) by calculating the 2,3, 4 and 7 root of x in fixed point. After calculating we again need to get the result in floating point by dividing with 1<<8.

Code for above method can be found in the accepted answer.

There is also a log based method:

x^y = exp2(y*log2(x))


Solution

  • Integer examples are for 32 bit int arithmetics, DWORD is 32bit unsigned int

    1. Floating pow(x,y)=xy

      is usually evaluated like this:

      so the fractional exponent can be evaluated: pow(x,y) = exp2(y*log2(x)). This can be done also on fixed point:

    2. Integer pow(a,b)=ab where a>=0 , b>=0

      This is easy (you already have that) done by squaring:

           DWORD powuu(DWORD a,DWORD b)
           {   
               int i,bits=32;
               DWORD d=1;
               for (i=0;i<bits;i++)
               {
                   d *= d;
                   if (DWORD(b&0x80000000))
                       d *= a;
                   b<<=1;
               }
               return d;
           }
      
    3. Integer pow(a,b)=ab where b>=0

      Just add few ifs to handle the negative a

           int powiu(int a,DWORD b)
           {
               int sig=0,c;
               if ((a<0)&&(DWORD(b&1))
               {
                   sig=1;
                   a=-a;
               } // negative output only if a<0 and b is odd
              c = powuu(a,b);
              if (sig)
                  c=-c;
              return c;
          }
      
    4. Integer pow(a,b)=ab

      So if b<0 then it means 1/powiu(a,-b) As you can see the result is not integer at all so either ignore this case or return floating value or add a multiplier variable (so you can evaluate PI equations on pure Integer arithmetics). This is float result:

           float powfii(int a,int b)
           {
               if (b < 0)
                  return 1.0/float(powiu(a,-b));
               else
                  return powiu(a,b);
           }
      
    5. Integer pow(a,b)=ab where b is fractional

      You can do something like this a^(1/bb) where bb is integer. In reality this is rooting so you can use binary search to evaluate:

      • a^(1/2) is square root(a)
      • a^(1/bb) is bb_root(a)

      so do a binary search for c from MSB to LSB and evaluate if pow(c,bb)<=a then leave the bit as is else clear it. This is sqrt example:

           int bits(DWORD p) // count how many bits is p
           {
               DWORD m=0x80000000; int b=32;
               for (;m;m>>=1,b--)
                   if (p>=m)
                       break;
               return b;
           }
      
           DWORD sqrt(const DWORD &x)
           {
               DWORD m,a;
               m=(bits(x)>>1);
               if (m)
                   m=1<<m;
               else
                   m=1;
               for (a=0;m;m>>=1)
               {
                   a|=m;
                   if (a*a>x)
                       a^=m;
               }
               return a;
           }
      

      so now just change the if (a*a>x) with if (pow(a,bb)>x) where bb=1/b ... so b is fractional exponent you looking for and bb is integer. Also m is the number of bits of the result so change m=(bits(x)>>1); to m=(bits(x)/bb);

    [edit1] fixed point sqrt example

    //---------------------------------------------------------------------------
    const int _fx32_fract=16;       // fractional bits count
    const int _fx32_one  =1<<_fx32_fract;
    DWORD fx32_mul(const DWORD &x,const DWORD &y)   // unsigned fixed point mul
    {
        DWORD a=x,b=y;              // asm has access only to local variables
        asm
        {                       // compute (a*b)>>_fx32_fract
            mov eax,a               // eax=a
            mov ebx,b               // ebx=b
            mul eax,ebx             // (edx,eax)=eax*ebx
            mov ebx,_fx32_one
            div ebx                 // eax=(edx,eax)>>_fx32_fract
            mov a,eax;
        }
        return a;
    }
    DWORD fx32_sqrt(const DWORD &x) // unsigned fixed point sqrt
    {
        DWORD m,a;
        if (!x)
            return 0;
        m=bits(x);                  // integer bits
        if (m>_fx32_fract)
            m-=_fx32_fract;
        else
            m=0;
        m>>=1;                      // sqrt integer result is half of x integer bits
        m=_fx32_one<<m;             // MSB of result mask
        for (a=0;m;m>>=1)           // test bits from MSB to 0
        {
            a|=m;                   // bit set
            if (fx32_mul(a,a)>x)    // if result is too big
                a^=m;                   // bit clear
        }
        return a;
    }
    //---------------------------------------------------------------------------
    
    so this is unsigned fixed point. High `16` bits are integer and low `16` bits are fractional part.
    
    - this is fp -> fx conversion:  `DWORD(float(x)*float(_fx32_one))`
    - this is fp <- fx conversion:  `float(DWORD(x))/float(_fx32_one))`
    
    • fx32_mul(x,y) is x*y it uses assembler of 80386+ 32bit architecture (you can rewrite it to karatsuba or whatever else to be platform independent)

      • fx32_sqrt(x) is sqrt(x)

      In fixed point you should be aware of the fractional bit shift for multiplication: (a<<16)*(b<<16)=(a*b<<32) you need to shift back by >>16 to get result (a*b<<16). Also the result can overflow 32 bit therefore I use 64 bit result in assembly.

    [edit2] 32bit signed fixed point pow C++ example

    When you put all the previous steps together you should have something like this:

    //---------------------------------------------------------------------------
    //--- 32bit signed fixed point format (2os complement)
    //---------------------------------------------------------------------------
    // |MSB              LSB|
    // |integer|.|fractional|
    //---------------------------------------------------------------------------
    const int _fx32_bits=32;                                // all bits count
    const int _fx32_fract_bits=16;                          // fractional bits count
    const int _fx32_integ_bits=_fx32_bits-_fx32_fract_bits; // integer bits count
    //---------------------------------------------------------------------------
    const int _fx32_one       =1<<_fx32_fract_bits;         // constant=1.0 (fixed point)
    const float _fx32_onef    =_fx32_one;                   // constant=1.0 (floating point)
    const int _fx32_fract_mask=_fx32_one-1;                 // fractional bits mask
    const int _fx32_integ_mask=0xFFFFFFFF-_fx32_fract_mask; // integer bits mask
    const int _fx32_sMSB_mask =1<<(_fx32_bits-1);           // max signed bit mask
    const int _fx32_uMSB_mask =1<<(_fx32_bits-2);           // max unsigned bit mask
    //---------------------------------------------------------------------------
    float fx32_get(int   x) { return float(x)/_fx32_onef; }
    int   fx32_set(float x) { return int(float(x*_fx32_onef)); }
    //---------------------------------------------------------------------------
    int fx32_mul(const int &x,const int &y) // x*y
    {
        int a=x,b=y;                // asm has access only to local variables
        asm
        {                       // compute (a*b)>>_fx32_fract
            mov eax,a
            mov ebx,b
            mul eax,ebx             // (edx,eax)=a*b
            mov ebx,_fx32_one
            div ebx                 // eax=(a*b)>>_fx32_fract
            mov a,eax;
        }
        return a;
    }
    //---------------------------------------------------------------------------
    int fx32_div(const int &x,const int &y) // x/y
    {
        int a=x,b=y;                // asm has access only to local variables
        asm
        {                       // compute (a*b)>>_fx32_fract
            mov eax,a
            mov ebx,_fx32_one
            mul eax,ebx             // (edx,eax)=a<<_fx32_fract
            mov ebx,b
            div ebx                 // eax=(a<<_fx32_fract)/b
            mov a,eax;
        }
        return a;
    }
    //---------------------------------------------------------------------------
    int fx32_abs_sqrt(int x)            // |x|^(0.5)
    {
        int m,a;
        if (!x)
            return 0;
        if (x<0)
            x=-x;
        m=bits(x);                  // integer bits
        for (a=x,m=0;a;a>>=1,m++);  // count all bits
        m-=_fx32_fract_bits;        // compute result integer bits (half of x integer bits)
        if (m<0)
            m=0;
        m>>=1;
        m=_fx32_one<<m;             // MSB of result mask
        for (a=0;m;m>>=1)           // test bits from MSB to 0
        {
            a|=m;                   // bit set
            if (fx32_mul(a,a)>x)    // if result is too big
                a^=m;                   // bit clear
        }
        return a;
    }
    //---------------------------------------------------------------------------
    int fx32_pow(int x,int y)       // x^y
    {
        // handle special cases
        if (!y)
            return _fx32_one;                           // x^0 = 1
        if (!x)
            return 0;                                   // 0^y = 0  if y!=0
        if (y==-_fx32_one)
            return fx32_div(_fx32_one,x);   // x^-1 = 1/x
        if (y==+_fx32_one)
            return x;                       // x^+1 = x
        int m,a,b,_y;
        int sx,sy;
        // handle the signs
        sx=0;
        if (x<0)
        {
            sx=1;
            x=-x;
        }
        sy=0;
        if (y<0)
        {
            sy=1;
            y=-y;
        }
        _y=y&_fx32_fract_mask;      // _y fractional part of exponent
         y=y&_fx32_integ_mask;      //  y integer part of exponent
        a=_fx32_one;                // ini result
        // powering by squaring x^y
        if (y)
        {
            for (m=_fx32_uMSB_mask;(m>_fx32_one)&&(m>y);m>>=1);     // find mask of highest bit of exponent
            for (;m>=_fx32_one;m>>=1)
            {
                a=fx32_mul(a,a);
                if (int(y&m))
                    a=fx32_mul(a,x);
            }
        }
        // powering by rooting x^_y
        if (_y)
        {
            for (b=x,m=_fx32_one>>1;m;m>>=1)                            // use only fractional part
            {
                b=fx32_abs_sqrt(b);
                if (int(_y&m))
                    a=fx32_mul(a,b);
            }
        }
        // handle signs
        if (sy)
        {
            if (a)
                a=fx32_div(_fx32_one,a);
            else
                a=0; /*Error*/
        }   // underflow
        if (sx)
        {
            if (_y)
                a=0; /*Error*/
            else if(int(y&_fx32_one))
                a=-a;
        }   // negative number ^ non integer exponent, here could add test if 1/_y is integer instead
        return a;
    }
    //---------------------------------------------------------------------------
    

    I have tested it like this:

    float a,b,c0,c1,d;
    int x,y;
    for (a=0.0,x=fx32_set(a);a<=10.0;a+=0.1,x=fx32_set(a))
        for (b=-2.5,y=fx32_set(b);b<=2.5;b+=0.1,y=fx32_set(b))
        {
            if (!x)
                continue; // math pow has problems with this
            if (!y)
                continue; // math pow has problems with this
            c0=pow(a,b);
            c1=fx32_get(fx32_pow(x,y));
            d=0.0;
            if (fabs(c1)<1e-3)
                d=c1-c0;
            else
                d=(c0/c1)-1.0;
            if (fabs(d)>0.1)
                d=d; // here add breakpoint to check inconsistencies with math pow
        }
    
    • a,b are floating point
    • x,y are closest fixed point representations of a,b
    • c0 is math pow result
    • c1 is fx32_pow result
    • d is difference

    I hope did not forget something trivial but it seems like it works properly. Do not forget that fixed point has very limited precision so the results will differ a bit ...

    P.S. Take a look at this: