Search code examples
pythonfractionsmpfrgmpy

Python multi-precision rational comparison: Fraction, mpq and mpfr


I understand that floating-point calculation is not accurate due to its nature. I'm trying to find out the best library/way to do multi-precision ration comparison. I'm comparing Fraction, mpq and mpfr. The later two are from gmpy2 library. The first one is from fractions package. I'm using python3.3

This is the script I used to compare. Not very well written, a very simple one.

from fractions import Fraction
from gmpy2 import mpq, mpfr
import time

# This script compares gmpy2 library and Fraction library

total_pass_mpq = 0
total_pass_mpfr = 0
total_pass_frc = 0

a = mpq("-3.232429")
a_ = Fraction("-3.232429")
a__ = mpfr("-3.232429")
if str(float(a)) == "-3.232429":
    total_pass_mpq +=1
if str(float(a_)) == "-3.232429":
    total_pass_frc += 1
if str(float(a__)) == "-3.232429":
    total_pass_mpfr += 1

b = mpq("604.08")
c = mpq("1.979")
b_ = Fraction("604.08")
c_ = Fraction("1.979")
b__ = mpfr("604.08")
c__ = mpfr("1.979")
if str(float(b*c)) == "1195.47432":
    total_pass_mpq += 1
if str(float(b_*c_)) == "1195.47432":
    total_pass_frc += 1
if str(float(b__*c__)) == "1195.47432":
    total_pass_mpfr += 1

d = mpq(604.08)
e = mpq(1.979)
d_ = Fraction(604.08)
e_ = Fraction(1.979)
d__ = mpfr(604.08)
e__ = mpfr(1.979)
if str(float(d*e)) == "1195.47432":
    total_pass_mpq += 1
if str(float(d_*e_)) == "1195.47432":
    total_pass_frc += 1
if str(float(d__*e__)) == "1195.47432":
    total_pass_mpfr += 1

f = mpq(-3.232429)
f_ = Fraction(-3.232429)
f__ = mpfr(-3.232429)
if str(float(f)) == "-3.232429":
    total_pass_mpq +=1
if str(float(f_)) == "-3.232429":
    total_pass_frc += 1
if str(float(f__)) == "-3.232429":
    total_pass_mpfr +=1

g = mpq(503.79)
g_ = Fraction(503.79)
g__ = mpfr(503.79)
h = mpq(0.07)
h_ = Fraction(0.07)
h__ = mpfr(0.07)
if str(float(g*(1+h))) == "539.0553":
    total_pass_mpq += 1
if str(float(g_*(1+h_))) == "539.0553":
    total_pass_frc += 1
if str(float(g__*(1+h__))) == "539.0553":
    total_pass_mpfr += 1

print("Total passed mpq: " + str(total_pass_mpq))
print("Total passed Fraction: " + str(total_pass_frc))
print("Total passed mpfr: " + str(total_pass_mpfr))

start_mpq = time.time()
for i in range(0, 50000):
    y = mpq(0.32329)
    z = mpq(-1)
    yz = y*z
end_mpq = time.time()
print("Time for executing mpq: " + str(end_mpq - start_mpq))

start_frc = time.time()
for j in range(0, 50000):
    y = Fraction(0.32329)
    z = Fraction(-1)
    yz_ = y*z
end_frc = time.time()
print("Time for executing frc: " + str(end_frc - start_frc))

start_frc_2 = time.time()
for j_ in range(0, 50000):
    y = Fraction(0.32329)
    z = Fraction(-1)
    yz_2 = y*z
end_frc_2 = time.time()
print("Time for executing frc str: " + str(end_frc_2 - start_frc_2))

start_mpfr = time.time()
for k in range(0, 50000):
    y = mpfr(0.32329)
    z = mpfr(-1)
    yz__ = y*z
end_mpfr = time.time()
print("Time for executing mpfr: " + str(end_mpfr - start_mpfr))

start_mpfr_2 = time.time()
for k_ in range(0, 50000):
    y = mpfr("0.32329")
    z = mpfr("-1")
    yz__2 = y*z
end_mpfr_2 = time.time()
print("Time for executing mpfr str: " + str(end_mpfr_2 - start_mpfr_2))

This is the result:

Total passed mpq: 3
Total passed Fraction: 5
Total passed mpfr: 4
Time for executing mpq: 0.04700875282287598
Time for executing frc: 2.1327619552612305
Time for executing frc str: 2.0934295654296875
Time for executing mpfr: 0.05441713333129883
Time for executing mpfr str: 0.12844634056091309

So basically I've got the result that Fraction is the most accurate one, but it's super slow. For this question, I wanted to ask,

  1. is there any other case that you think I should also try?
  2. any other library?
  3. If speed matters, is there a way to improve precision using gmpy2 library?

Solution

  • float(mpq) calls the GMP library function mpq_get_q. I checked the GMP source and mpq_get_d rounds the intermediate result towards 0. It does not calculate a correctly rounded result. (In this case, correctly rounded implies round to nearest with ties to even.) So it will occasionally be different than float(Fraction).

    The GMP library is not optimized for floating point calculations. To get correctly rounded floating values, you should use the MFPR libary (aka the mpfr type in gmpy2).

    The most accurate way to convert an mpq to a float is to convert it to an mpfr first. To avoid double-rounding, you should convert from mpq to mpfr with a precision of exactly 53 bits. So float(mpfr(mpq, 53)). (The default precision is currently 53 bits but that might change in the future. Specifying the desired precision, or ensuring the default context's precision is set to 53, is recommended.) This change makes mpq and Fraction return the same results in your example.

    There is still one mpfr result that is different. That is just inherent in the fact that intermediate mpfrcalculations are rounding to the current precision (53 bits in this case).

    Update to answer a question by @mattsun.

    Why does mpfr("503.79")*(mpfr("1")+mpfr("0.07")) not equal "539.0553"?

    Both Python's float type and gmpy2's mpfr type use a binary, or radix-2, representation. We normally use decimal, or radix-10, representation when we work with numbers. Just like 1/3 cannon be represented exactly in decimal arithmetic, most decimal numbers cannot be represented exactly with a binary representation. The calculations are done with values that are close, but not exactly equal, to the values given. The errors can accumulate, and the result will be just a little different than your expected value.

    There are two options:

    1) Format the string to the desired decimal format.

    2) Use the decimal library.

    Disclaimer: I maintain gmpy2.