Search code examples
cmathfloating-pointmodulusc89

How do I implement a modulus operator for double variables using frexp?


I am following Kernighan&Pike "The UNIX Programming Environment".

An exercise from the book (Exercise 8-2, page 241) asks implementing the modulo operator (%) for double variables in C.

So:

4.6 % 2.1 = 0.4
4.0 % 3.0 = 1.0

Therefore it is basically implementing dmod using frexp:

dmod(4.6, 2.1) would return 0.4
dmod(4,0, 3.0) would return 1.0

I have seen this post: implementing a modulus on fixed point type which defines an algorithm to implement this operator.

But the book suggests as a hint to read frexp(3), so I guess it is possible to do it using that function.

Now if I understood the man page correctly, that function does things like (pseudocode):

a,b -- double variables
a_exp,b_exp -- integer exponents for frexp
a_x = frexp(a,&a_exp) --> a = a_x * 2^a_exp
b_x = frexp(b,&b_exp) --> b = b_x * 2^b_exp
c=a/b
c_exp -- integer exponent for frexp
c_x = frexp(c,&c_exp) --> c = c_x * 2^c_exp

But I still can not figure out how to mix those values to get the modulus operator.

The book is old and there are probably better ways of doing it, but the question is more academic and still valid to understand how to implement this with frexp.


Solution

  • I do not know what specification the authors assumed for modulo of floating-point numbers. I assume here that they are referring to the functionality of the standard C library functionfmod().

    The simplest way of implementing fmod() is to use binary longhand division which produces the quotient of the division in a loop that produces one quotient bit per iteration. Repeat until all integer bits of the quotient have been exhausted, while retaining the partial remainder. At the end of the process, the final remainder represents desired result.

    To start the longhand division, we have to properly align the divisor with the dividend at the start. This is achieved by scaling such that dividend >= divisor > dividend/2. The use of frexp() in conjunction with ldexp() provides a rough scaling based on the exponents which may have to be refined based on the significands (mantissas).

    An exemplary ISO-C99 implementation of fmod()is shown below. Implementation of remainder() would look similar but a bit more complicated due to the requirement to round the quotient to nearest-or-even, rather than truncating it.

    #include <stdio.h>
    #include <stdlib.h>
    #include <stdint.h>
    #include <string.h>
    #include <math.h>
    
    /* returns the floating-point remainder of a/b (rounded towards zero) */
    double my_fmod (double a, double b)
    {
        const double NAN_INDEFINITE = 0.0 / 0.0;
        double r;
        if (isnan (a) || isnan (b)) {
            r = a + b;
        } else if (isinf (a) || (b == 0.0)) {
            r = NAN_INDEFINITE;
        } else {
            double fa, fb, dividend, divisor;
            int expo_a, expo_b;
            fa = fabs (a);
            fb = fabs (b);
            if (fa >= fb) {
                dividend = fa;
                /* normalize divisor */
                (void)frexp (fa, &expo_a);
                (void)frexp (fb, &expo_b);
                divisor = ldexp (fb, expo_a - expo_b);
                if (divisor <= 0.5 * dividend) {
                    divisor += divisor;
                }
                /* compute quotient one bit at a time */
                while (divisor >= fb) {
                    if (dividend >= divisor) {
                        dividend -= divisor;
                    }
                    divisor *= 0.5;
                }
                /* dividend now represents remainder */
                r = copysign (dividend, a);
            } else {
                r = a;
            }
        }
        return r;
    }
    
    /*
      From: geo <[email protected]>
      Newsgroups: sci.math,comp.lang.c,comp.lang.fortran
      Subject: 64-bit KISS RNGs
      Date: Sat, 28 Feb 2009 04:30:48 -0800 (PST)
    
      This 64-bit KISS RNG has three components, each nearly
      good enough to serve alone.    The components are:
      Multiply-With-Carry (MWC), period (2^121+2^63-1)
      Xorshift (XSH), period 2^64-1
      Congruential (CNG), period 2^64
    */
    
    static uint64_t kiss64_x = 1234567890987654321ULL;
    static uint64_t kiss64_c = 123456123456123456ULL;
    static uint64_t kiss64_y = 362436362436362436ULL;
    static uint64_t kiss64_z = 1066149217761810ULL;
    static uint64_t kiss64_t;
    
    #define MWC64  (kiss64_t = (kiss64_x << 58) + kiss64_c, \
                    kiss64_c = (kiss64_x >> 6), kiss64_x += kiss64_t, \
                    kiss64_c += (kiss64_x < kiss64_t), kiss64_x)
    #define XSH64  (kiss64_y ^= (kiss64_y << 13), kiss64_y ^= (kiss64_y >> 17), \
                    kiss64_y ^= (kiss64_y << 43))
    #define CNG64  (kiss64_z = 6906969069ULL * kiss64_z + 1234567ULL)
    #define KISS64 (MWC64 + XSH64 + CNG64)
    
    double int64_as_double (int64_t a)
    {
        double r;
        memcpy (&r, &a, sizeof r);
        return r;
    }
    
    int32_t double_as_int64 (double a)
    {
        int64_t r;
        memcpy (&r, &a, sizeof r);
        return r;
    }
    
    int main (void)
    {
        double a, b, res, ref;
        uint64_t i = 0;
        do {
            a = int64_as_double (KISS64);
            b = int64_as_double (KISS64);
            ref = fmod (a, b);
            res = my_fmod (a, b);
            if (double_as_int64 (res) != double_as_int64 (ref)) {
                printf ("error: a=% 23.16e b=% 23.16e res=% 23.16e ref=% 23.16e\n", a, b, res, ref);
                return EXIT_FAILURE;
            }
            i++;
            if (!(i & 0xfffff)) printf ("\r%llu", i);
        } while (i);
        return EXIT_SUCCESS;
    }