Search code examples
c++algorithmtheorymodulusfixed-point

implementing a modulus on fixed point type


so currently I'm helping develop a programming language, and we've reached the point where we have to implement a fixed point type using C++ as our backbone language to write this in. I am able to figure out how to add, subtract, multiply, divide, however I am drawing a blank for how to accomplish this for modulus and exponentials. I feel I am close to figuring out exponentials, so I will focus this question on modulus.

Currently what I do is I take in a string literal and track the point of the radix and find the distance of that from the end of the string, and this gives me my scaling factor. After that we keep the entire big integer, which is supposed to be constrained to a fixed type (unless circumstances like division...or possibly modulus, arise, at which point we increase the fractional side of the radix to atleast half of the size of the integer portion). I have implemented ways to grab values to the left of the radix and to the right of the radix by multiplying by factors of 10 (I want the accuracy that comes with base 10 as opposed to the speed of base 2) to get the place where the radix will sit. I have googled looking up how to implement modulus on fixed point types, to no avail.

I can't seem to figure out how one would go about implementing it. Is there a way of implementing this? Do people know of a generalized algorithm? Anything to point me in the right direction? Remember, I'm aiming for accuracy. Speed is nice too, but the prime directive is accuracy.

To clarify, the hardware we would be operating on is generalized. As for the definition of what I want...that's somewhat why I'm here. I don't know what I want, I want to know some examples and different options to choose from. Really I'm just trying to learn about this.

EXAMPLE:

say I have a fixed8x8 and I push out 2 % 1.2 (2 could be 2.0 as well here), the result should come back 0.8. What's a good rule for going about extending the size of the right side to compensate for accuracy?


Solution

  • With the correction of 1.2 % 2 == 1.2, then just ignore the decimal, e.g. this is the same as 12 % 20 == 12

    You can just use subtraction to compute this, if speed is truly not necessary. E.g. for any A % M, just keep subtracting M from A until A < M, for M != 0 (infinite loop!). This will provide the same result, 1.2, since subtraction really isn't affected by decimal points when represented as fixed-point.

    If speed matters, and you don't have multiplication routines that can ignore the decimal (most won't), you may need to build them, since at 8x8 you won't have enough space to shift the entire value into the left-side (for some operations, I implemented 24x8 fixed point registers in my PIC 16F library to allow higher precision when performing computations on 16x8 fixed point).

    To elaborate on using multiplication for faster arithmetic, and the reason you would need multiplication routines that can ignore the decimal,

    First we analyze the slower subtraction approach using this pseudo-C++ code:

    auto answer = A;
    while(answer >= M)
      answer -= M;
    // now, your answer is in answer
    

    But if we knew how many iterations were necessary beforehand, we could just:

    auto answer = A - M * number_of_iterations;
    

    Of course, we won't know that, so the goal is to reduce the number of iterations that we need rather. This can be done with reverse-factorization - given a number M, find how many times you can multiply it by 10 (one digit to the left) and it still be less than A, so e.g. if A is 1023.32 and M is 4.1, you can multiply M by 10 twice, resulting in M*10*10 = 410. Subtract this from A until A < M*10*10, leaving A'=203.32, and continue to repeat until your working copy of A is less than M. This turns approach generally turns an O(N) operation into an O(log(N)) operation, provided you don't add too much overhead with the computations (in my PIC 16F implementation, I had to be careful because the chip only had 384 bytes of total memory, so needless to say, it was slower than it could have been).

    I mostly did my work in base-2, but the same ideas translate to base-10 as well, so something like this should greatly reduce the number of iterations, especially for larger items (and this can be adjusted depending on your internal representation, so you can use digit-shifting rather than multiplication for each step):

    auto currentA = A;
    while(currentA >= M){
      auto scaledM = M;
      while(scaledM*10 < currentA)
        scaledM *= 10;
      // this second loop prevents recomputing scaledM where
      // currentA > n*scaledM for some n < 10; not needed in base-2
      while(currentA > scaledM)
        currentA -= scaledM;
    }
    

    And currentA will contain your modulus when this completes.