I was trying to verify the mean distance between 2 points in various 3-D and 2-D structures by taking average of multiple random cases. Almost all the time, I was getting a pretty good accuracy except for the case of points on the surface of a sphere. My code uses Gaussian distribution inspired from this answer (see the second most up voted answer)
Here is the python code:
import math as m
from random import uniform as u
sum = 0
for i in range(10000):
x1 = u(-1, 1)
y1 = u(-1, 1)
x2 = u(-1, 1)
y2 = u(-1, 1)
z1 = u(-1, 1)
z2 = u(-1, 1)
if x1 == y1 == z1 == 0:
sum += m.sqrt((x2) ** 2 + (y2) ** 2 + (z2) ** 2)
elif x2 == y2 == z2 == 0:
sum += m.sqrt((x1) ** 2 + (y1) ** 2 + (z1) ** 2)
else:
x1 /= m.sqrt(x1 ** 2 + y1 ** 2 + z1 ** 2)
y1 /= m.sqrt(x1 ** 2 + y1 ** 2 + z1 ** 2)
z1 /= m.sqrt(x1 ** 2 + y1 ** 2 + z1 ** 2)
x2 /= m.sqrt(x2 ** 2 + y2 ** 2 + z2 ** 2)
y2 /= m.sqrt(x2 ** 2 + y2 ** 2 + z2 ** 2)
z2 /= m.sqrt(x2 ** 2 + y2 ** 2 + z2 ** 2)
sum += m.sqrt((x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1-z2) ** 2)
print(sum/10000)
The expected value is 4/3 which is shown here
Arguably the absolute difference is not very large. But the percentage deviation from expected value is around 1% on any run. On the other hand, in all other similar programs with other shapes and same number of random cases, the % deviation is around 0.05% on average.
Also, the value that the code returns is always less than 4/3. This is my major concern.
My guess is that I have implemented the algorithm in a wrong way. Any help is appreciated.
Edit:
After realizing the mistake made in the previous method, I now, first use rejection sampling to get the points lying inside the sphere. This will ensure that after dividing the point vectors with their norms, the resulting unit vector distribution will be uniform. In spite of doing that, I am getting a different result, which is unexpectedly more deviated from expected than the previous one. To be more precise, the limit approaches 1.25 with this algorithm.
Here is the code:
sum2 = 0
size = 0
for t in range(10000): # Attempt 2
x1 = u(-1, 1)
y1 = u(-1, 1)
x2 = u(-1, 1)
y2 = u(-1, 1)
z1 = u(-1, 1)
z2 = u(-1, 1)
if (x1**2 + y1**2 + z1**2)>1 or (x2**2 + y2**2 + z2**2)>1 or x1==y1==z1==0 or x2==y2==z2==0: continue
size += 1
x1 /= m.sqrt(x1 ** 2 + y1 ** 2 + z1 ** 2)
y1 /= m.sqrt(x1 ** 2 + y1 ** 2 + z1 ** 2)
z1 /= m.sqrt(x1 ** 2 + y1 ** 2 + z1 ** 2)
x2 /= m.sqrt(x2 ** 2 + y2 ** 2 + z2 ** 2)
y2 /= m.sqrt(x2 ** 2 + y2 ** 2 + z2 ** 2)
z2 /= m.sqrt(x2 ** 2 + y2 ** 2 + z2 ** 2)
sum2 += m.sqrt((x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2)
print(size)
print(sum2/size)
The initial random values for the two points are contained within a cube, rather than a sphere. After scaling each vector by 1/length, the vectors are on the unit sphere, but they are not evenly distributed across the surface of the sphere.
You will tend to get more vectors near the corners of the cube, compared to the centre of each face. Since the vectors tend to cluster in regions, the average value of the distance between them is less than 4/3.
This will do the trick: https://mathworld.wolfram.com/SpherePointPicking.html
This code works for me:
from math import sqrt
from random import uniform
sum2 = 0
size = 0
while size < 100000:
x1 = uniform(-1, 1)
y1 = uniform(-1, 1)
x2 = uniform(-1, 1)
y2 = uniform(-1, 1)
z1 = uniform(-1, 1)
z2 = uniform(-1, 1)
r1 = sqrt(x1**2 + y1**2 + z1**2)
r2 = sqrt(x2**2 + y2**2 + z2**2)
if r1 > 1 or r2 > 1 or x1==y1==z1==0 or x2==y2==z2==0: continue
size += 1
x1 /= r1
y1 /= r1
z1 /= r1
x2 /= r2
y2 /= r2
z2 /= r2
sum2 += sqrt((x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2)
print(sum2/size)
Output was:
1.3337880809331075