I stumbled across an interesting optimization question while writing some mathematical derivative functions for a neural network library. It turns out that the expression a / (b*c) takes longer to compute than a / b / c for large values (see the timeit runs below). But since the two expressions are mathematically equal, why doesn't Python evaluate a / (b*c) as a / b / c, given that it seems to be slower? Thanks in advance :)
In [2]: timeit.timeit('1293579283509136012369019234623462346423623462346342635610 / (52346234623632464236234624362436234612830128521357*32189512234623462637501237)')
Out[2]: 0.2646541080002862
In [3]: timeit.timeit('1293579283509136012369019234623462346423623462346342635610 / 52346234623632464236234624362436234612830128521357 / 32189512234623462637501237')
Out[3]: 0.008390166000026511
Why is a/(b*c) slower?
(b*c) multiplies two very big ints with unlimited precision. That is a more expensive operation than performing a floating-point division (which has limited precision).
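As a rough sketch (not part of the original answer), the cost of the individual sub-operations can be compared by timing them separately. The variables below mirror the question's a, b and c, the globals= argument just makes those names visible to timeit, and the absolute numbers will vary from machine to machine:

import timeit

a = 1293579283509136012369019234623462346423623462346342635610
b = 52346234623632464236234624362436234612830128521357
c = 32189512234623462637501237

print(timeit.timeit('b * c', globals=globals()))        # the big-int multiplication on its own
print(timeit.timeit('a / (b * c)', globals=globals()))  # multiplication followed by a single division
print(timeit.timeit('a / b / c', globals=globals()))    # two divisions, no big-int product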
Are the two calculations equivalent?
In practice, a/(b*c) and a/b/c can give different results, because floating-point calculations have inaccuracies, and doing the operations in a different order can produce a different result.
For example:
a = 10 ** 33
b = 10000000000000002
c = 10 ** 17
print(a / b / c) # 0.9999999999999999
print(a / (b * c)) # 0.9999999999999998
It boils down to how a computer deals with the numbers it uses.
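One way to make that concrete (a small sketch, not part of the original answer) is to compute the exact quotient with the standard-library fractions module, using the same a, b and c as above, and compare it with the two floating-point results:

from fractions import Fraction

a = 10 ** 33
b = 10000000000000002
c = 10 ** 17

exact = Fraction(a, b * c)   # the quotient as an exact rational number
print(float(exact))          # 0.9999999999999998 (the exact value, rounded once to a float)
print(a / (b * c))           # 0.9999999999999998 (one rounding step)
print(a / b / c)             # 0.9999999999999999 (two rounding steps)

Here the single-division form happens to match the exact value after rounding, while the two-division form rounds twice and lands one unit in the last place away.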
Why doesn't Python calculate a/(b*c) as a/b/c?
That would give surprising results. The user ought to be able to expect that
d = b*c
a / d
should have the same result as a / (b*c), so it would be a source of very mysterious behaviour if a / (b*c) gave a different result because it was magically replaced by a / b / c.
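A quick check with the numbers from the earlier example (the name d is just for illustration) shows why such a rewrite would be surprising:

a = 10 ** 33
b = 10000000000000002
c = 10 ** 17

d = b * c
print(a / d == a / (b * c))   # True: naming the intermediate product changes nothing
print(a / d == a / b / c)     # False: rewriting a / (b*c) as a / b / c would silently change the result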