Tags: python, algorithm, numpy, linear-programming, least-squares

Least squares on a linear N-way-equal problem


Suppose I want to find the "intersection point" of 2 arbitrary high-dimensional lines. The two lines generally won't actually intersect, but I still want to find a "most intersecting" point (i.e. a point that is as close to both lines as possible).

Suppose those lines have direction vectors A, B and initial points C, D. I can find the most intersecting point by simply setting up a linear least-squares problem: convert the line-intersection equation

Ax + C = By + D

to least-square form

[A, -B] @ [[x], [y]] = D - C

where @ stands for matrix-vector multiplication, and then I can use e.g. np.linalg.lstsq to solve it.
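
For example, a minimal numpy sketch of this two-line case (the vectors are made-up 3-D data, purely for illustration):

import numpy as np

# Made-up 3-D lines: direction vectors A, B and initial points C, D
A = np.array([1.0, 0.0, 0.5])
B = np.array([0.0, 1.0, -0.5])
C = np.array([0.0, 0.0, 0.0])
D = np.array([1.0, 1.0, 1.0])

M = np.stack([A, -B], axis=1)     # the [A, -B] matrix, shape (3, 2)
params, *_ = np.linalg.lstsq(M, D - C, rcond=None)
x, y = params

# The two closest points; their midpoint is the "most intersecting" point
point = ((A * x + C) + (B * y + D)) / 2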

But how can I find the "most intersecting point" of 3 or more arbitrary lines? If I follow the same rule, I now have

Ax + D = By + E = Cz + F

The only way I can think of is decomposing this into three equations:

Ax + D = By + E
Ax + D = Cz + F
By + E = Cz + F

and converting them to least-squares form

[A, -B,  0]                     [E - D]
[A,  0, -C] @ [[x], [y], [z]] = [F - D]
[0,  B, -C]                     [F - E]

The problem is that the size of the least-squares problem grows quadratically with the number of lines: n lines give n*(n-1)/2 pairwise equations. I'm wondering: is there a more efficient way to solve an n-way-equal least-squares linear problem?
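
For concreteness, here is one way the full all-pairs system could be assembled (a sketch only; the function name all_pairs_intersect is made up, and k, b are N-by-D arrays of direction vectors and initial points, as in the code further below):

import numpy as np
from itertools import combinations

def all_pairs_intersect(k, b):
    """Sketch: one block row per pair (i, j) of lines, encoding
    k[i]*x[i] + b[i] = k[j]*x[j] + b[j], solved by least squares."""
    n, d = k.shape
    pairs = list(combinations(range(n), 2))   # N*(N-1)/2 pairs
    A = np.zeros((len(pairs), d, n))
    B = np.zeros((len(pairs), d))
    for row, (i, j) in enumerate(pairs):
        A[row, :, i] = k[i]                   # +k[i] * x[i]
        A[row, :, j] = -k[j]                  # -k[j] * x[j]
        B[row] = b[j] - b[i]
    x = np.linalg.lstsq(A.reshape(-1, n), B.ravel(), rcond=None)[0]
    # Average the points each line reaches at its fitted parameter
    return (x[:, None] * k + b).mean(0)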

I was also thinking about whether By + E = Cz + F above is necessary, given the other two equations. But since this problem has no exact solution (i.e. the lines don't actually intersect), I believe dropping it would put more "weight" on some variables?

Thank you for your help!

EDIT

I just tested pairing the first term with all other terms in the n-way-equality (and no other pairs) using the following code:

import numpy as np

def lineIntersect(k, b):
    """k, b: N-by-D matrices describing N D-dimensional lines: k[i] * x + b[i]"""

    # Convert the problem to least-square form `Ax = B`
    # A is temporarily defined 3-dimensional for convenience
    A = np.zeros((len(k)-1, k.shape[1], len(k)), k.dtype)
    A[:, :, 0] = k[0]                                 # pair line 0 with every other line
    A[range(len(k)-1), :, range(1, len(k))] = -k[1:]  # -k[i] for line i in its own block row

    # Convert to 2-dimensional matrix by flattening first two dimensions
    A = A.reshape(-1, len(k))

    # B should be 1-dimensional vector
    B = (b[1:] - b[0]).ravel()

    # Solve the stacked system in the least-squares sense
    x = np.linalg.lstsq(A, B, rcond=None)[0]

    # Evaluate each line at its fitted parameter and average the points
    return (x[:, None] * k + b).mean(0)

The result below indicates that doing so is not correct, because the first term in the n-way-equality is "weighted" differently.

The first output is the difference between the regular result and the result for a permuted input order (line order should not matter) in which the first line is unchanged.

The second output is the same comparison, but with a permutation that does change the first line.

k = np.random.rand(10, 100)
b = np.random.rand(10, 100)
print(np.linalg.norm(lineIntersect(k, b) - lineIntersect(np.r_[k[:1],k[:0:-1]], np.r_[b[:1],b[:0:-1]])))
print(np.linalg.norm(lineIntersect(k, b) - lineIntersect(k[::-1], b[::-1])))

results in

7.889616961715915e-16
0.10702479853076755

Solution

  • Another criterion for the 'almost intersection point' would be a point x such that the sum of the squares of the distances of x to the lines is as small as possible. As with your criterion, if the lines actually do intersect then the almost intersection point will be the actual intersection point. However, I think the sum-of-squared-distances criterion makes it straightforward to compute the point in question:

    Suppose we represent a line by a point and a unit vector along the line. So if a line is represented by p,t then the points on the line are of the form

    p + l*t for scalar l
    

    The distance-squared of a point x from a line p,t is

    (x-p)'*(x-p) - square( t'*(x-p))
    

    If we have N lines p[i],t[i] then the sum of the distances squared from a point x is

    Sum { (x-p[i])'*(x-p[i]) - square( t[i]'*(x-p[i])) }
    

    Expanding this out (using the identity square(t'*y) = y'*(t*t')*y for any vector y), I get the above to be

    x'*S*x - 2*x'*V + K
    

    where

    S = N*I - Sum{ t[i]*t[i]'}
    V = Sum{ p[i] - (t[i]'*p[i])*t[i] }
    

    and K does not depend on x

    Unless all the lines are parallel, S will be (strictly) positive definite and hence invertible, and in that case our sum of distances squared is

    (x-inv(S)*V)'*S*(x-inv(S)*V) + K - V'*inv(S)*V
    

    Thus the minimising x is

    inv(S)*V
    

    So the drill is: normalise your 'direction vectors' to unit length (the base points need no rescaling, since any point on each line will do), form S and V as above, and solve

    S*x = V for x
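
    In numpy this recipe might look like the following sketch (assuming, as in the question, that k and b are N-by-D arrays of direction vectors and points on the lines; the function name intersect is made up):

    import numpy as np

    def intersect(k, b):
        """Point minimising the sum of squared distances to the lines k[i]*s + b[i]."""
        t = k / np.linalg.norm(k, axis=1, keepdims=True)    # unit direction vectors
        n, d = t.shape
        S = n * np.eye(d) - t.T @ t                         # N*I - Sum{ t[i]*t[i]' }
        V = (b - (t * b).sum(1, keepdims=True) * t).sum(0)  # Sum{ p[i] - (t[i]'*p[i])*t[i] }
        return np.linalg.solve(S, V)                        # solve S*x = V

    This is a single solve of a D-by-D system regardless of the number of lines, and it treats every line symmetrically, so the order-dependence observed in the edit above disappears: intersect(k[::-1], b[::-1]) should agree with intersect(k, b) to machine precision.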