I have the following functions for computing the square root of a number using the Newton-Raphson method.
double get_dx(double y, double x)
{
return (x*x - y)/(2*x);
}
double my_sqrt(double y, double initial)
{
double tolerance = 1.0E-6;
double x = initial;
double dx;
int iteration_count = 0;
while ( iteration_count < 100 &&
fabs(dx = get_dx(y, x)) > tolerance )
{
x -= dx;
++iteration_count;
if ( iteration_count > 90 )
{
printf("Iteration number: %d, dx: %lf\n", iteration_count, dx);
}
}
if ( iteration_count < 100 )
{
printf("Got the result to converge in %d iterations.\n", iteration_count);
}
else
{
printf("Could not get the result to converge.\n");
}
return x;
}
They work for most numbers. However, they don't converge when y
is 1.0E21
and 1.0E23
. There might be other numbers, that I don't know of yet, for which the functions don't converge.
I tested with initial values of:
y*0.5
-- something to get started with.1.0E10
-- a number in the neighborhood of the final result.sqrt(y)+1.0
-- A very close number to the final answer, obviously.In all the cases the functions failed to converge for 1.0E21
and 1.0E23
. I tried a lower number, 1.0E19
, and a higher number, 1.0E25
. Neither of them is a problem.
Here's a complete program:
#include <stdio.h>
#include <math.h>
double get_dx(double y, double x)
{
return (x*x - y)/(2*x);
}
double my_sqrt(double y, double initial)
{
double tolerance = 1.0E-6;
double x = initial;
double dx;
int iteration_count = 0;
while ( iteration_count < 100 &&
fabs(dx = get_dx(y, x)) > tolerance )
{
x -= dx;
++iteration_count;
if ( iteration_count > 90 )
{
printf("Iteration number: %d, dx: %lf\n", iteration_count, dx);
}
}
if ( iteration_count < 100 )
{
printf("Got the result to converge in %d iterations.\n", iteration_count);
}
else
{
printf("Could not get the result to converge.\n");
}
return x;
}
void test(double y)
{
double ans = my_sqrt(y, 0.5*y);
printf("sqrt of %lg: %lg\n\n", y, ans);
ans = my_sqrt(y, 1.0E10);
printf("sqrt of %lg: %lg\n\n", y, ans);
ans = my_sqrt(y, sqrt(y) + 1.0);
printf("sqrt of %lg: %lg\n\n", y, ans);
}
int main()
{
test(1.0E21);
test(1.0E23);
test(1.0E19);
test(1.0E25);
}
and here's the output (built on 64 bit Linux, using gcc 4.8.4):
Iteration number: 91, dx: -0.000002
Iteration number: 92, dx: 0.000002
Iteration number: 93, dx: -0.000002
Iteration number: 94, dx: 0.000002
Iteration number: 95, dx: -0.000002
Iteration number: 96, dx: 0.000002
Iteration number: 97, dx: -0.000002
Iteration number: 98, dx: 0.000002
Iteration number: 99, dx: -0.000002
Iteration number: 100, dx: 0.000002
Could not get the result to converge.
sqrt of 1e+21: 3.16228e+10
Iteration number: 91, dx: 0.000002
Iteration number: 92, dx: -0.000002
Iteration number: 93, dx: 0.000002
Iteration number: 94, dx: -0.000002
Iteration number: 95, dx: 0.000002
Iteration number: 96, dx: -0.000002
Iteration number: 97, dx: 0.000002
Iteration number: 98, dx: -0.000002
Iteration number: 99, dx: 0.000002
Iteration number: 100, dx: -0.000002
Could not get the result to converge.
sqrt of 1e+21: 3.16228e+10
Iteration number: 91, dx: 0.000002
Iteration number: 92, dx: -0.000002
Iteration number: 93, dx: 0.000002
Iteration number: 94, dx: -0.000002
Iteration number: 95, dx: 0.000002
Iteration number: 96, dx: -0.000002
Iteration number: 97, dx: 0.000002
Iteration number: 98, dx: -0.000002
Iteration number: 99, dx: 0.000002
Iteration number: 100, dx: -0.000002
Could not get the result to converge.
sqrt of 1e+21: 3.16228e+10
Iteration number: 91, dx: 0.000027
Iteration number: 92, dx: 0.000027
Iteration number: 93, dx: 0.000027
Iteration number: 94, dx: 0.000027
Iteration number: 95, dx: 0.000027
Iteration number: 96, dx: 0.000027
Iteration number: 97, dx: 0.000027
Iteration number: 98, dx: 0.000027
Iteration number: 99, dx: 0.000027
Iteration number: 100, dx: 0.000027
Could not get the result to converge.
sqrt of 1e+23: 3.16228e+11
Iteration number: 91, dx: -0.000027
Iteration number: 92, dx: -0.000027
Iteration number: 93, dx: -0.000027
Iteration number: 94, dx: -0.000027
Iteration number: 95, dx: -0.000027
Iteration number: 96, dx: -0.000027
Iteration number: 97, dx: -0.000027
Iteration number: 98, dx: -0.000027
Iteration number: 99, dx: -0.000027
Iteration number: 100, dx: -0.000027
Could not get the result to converge.
sqrt of 1e+23: 3.16228e+11
Iteration number: 91, dx: 0.000027
Iteration number: 92, dx: 0.000027
Iteration number: 93, dx: 0.000027
Iteration number: 94, dx: 0.000027
Iteration number: 95, dx: 0.000027
Iteration number: 96, dx: 0.000027
Iteration number: 97, dx: 0.000027
Iteration number: 98, dx: 0.000027
Iteration number: 99, dx: 0.000027
Iteration number: 100, dx: 0.000027
Could not get the result to converge.
sqrt of 1e+23: 3.16228e+11
Got the result to converge in 35 iterations.
sqrt of 1e+19: 3.16228e+09
Got the result to converge in 6 iterations.
sqrt of 1e+19: 3.16228e+09
Got the result to converge in 1 iterations.
sqrt of 1e+19: 3.16228e+09
Got the result to converge in 45 iterations.
sqrt of 1e+25: 3.16228e+12
Got the result to converge in 13 iterations.
sqrt of 1e+25: 3.16228e+12
Got the result to converge in 1 iterations.
sqrt of 1e+25: 3.16228e+12
Can anyone explain why the above functions don't converge for 1.0E21
and 1.0E23
?
This answer specifically explains why the algorithm fails to converge for an input of 1e23.
The issue that you're facing is known as the "small difference of large numbers". Specifically, you're computing (x*x - y)/(2*x)
where y
is 1e23, and x
is approximately 3.16e11.
The IEEE-754 encoding for 1e23 is 0x44b52d02c7e14af6
. Hence the exponent is 0x44b, which is 1099 decimal. That has to be reduced by 1023, since the exponent is encoded as offset-binary. Then you have to subtract another 52 to find the weight of an LSB.
0x44b ==> 1099 - 1023 - 52 = 24 ==> 1 LSB = 2^24
So a single LSB has the value 16777216.
The result from this code
double y = 1e23;
double x = sqrt(y);
double dx = x*x - y;
printf( "%lf\n", dx );
is in fact -16777216.
As a result, when you compute x*x - y
, the result is either going to be zero, or it's going to be a multiple of 16777216. Dividing 16777216 by 2*x
which is 2*3.16e11
gives an error of 0.000027, which is not within your tolerance.
The two closest values to the sqrt(y)
are
0x4252682931c035a0
0x4252682931c035a1
and when you square those numbers you get
0x44b52d02c7e14af5
0x44b52d02c7e14af7
So neither one matches the desired result, which is
0x44b52d02c7e14af6
and hence the algorithm can never converge.
The reason the algorithm works for 1e25 is due to luck. The encoding for 1e25 is
0x45208b2a2c280291
The encoding for the sqrt(1e25)
is
0x428702337e304309
and when you square that number, you get
0x45208b2a2c280291
Hence x*x - y
is exactly 0, and the algorithm gets lucky.