Python - Closest Minimum

I've 2 matrices both with Nx2 elements. Any value is a float with 8-10 decimals and they represent respectively 'x' and 'y' of a point.

For any element couple (x, y) (x is in the first column, while y is in the second one) in the first array, I'd need to find the closest point in the second one. At any loop, once found, I need to remove that value from the second array.

Finally, my main objective would be to have the optimal solution so that there's a one-to-one mapping between any element of the first array with only one element of the second array, so that the closest value clause is met.

I created a NxN matrix where I computed the distance from any point of first array to any point of the second array via

scipy.spatial.distance.cdist

Code:

def find_nearest(X_start, X_end):
    mat = scipy.spatial.distance.cdist(X_start, X_end, metric='euclidean')
    new_df = pd.DataFrame(mat)
    return new_df;

The next step is to couple a starting point with a ending point and there should NOT be any intersection, that is a mapping one-to-one.

I thought to do it via a Integer programming (using this). So if m[i][j] is an element of the matrix NxN, I found those constraints

The problem is that I don't know how to write the objective function and so I'm note sure if I need to add any other constraint related to it.

Do you think this is a good path to follow? Last question seems to be not appreciated since I did not expose what I already did.

So here it is.

Solution

This is called an assignment problem.

   min sum((i,j), dist[i,j]*x[i,j])
   subject to
       sum(i, x[i,j]) = 1 for all j
       sum(j, x[i,j]) = 1 for all i
       x[i,j] in {0,1}

where

 i = 1..n is an element of the first matrix
 j = 1..n is an element of the second matrix
 dist[i,j] is a distance matrix

These problems can be solved with specialized solvers or it can be formulated as an LP (linear programming) problem.

Scipy has a simple assignment solver (link). This is not a very fast implementation however: a good LP solver is faster (link).