How to improve my TSP implementation ? (Using Backtracking) (Python)

I'm trying to code a recursive implementation of the TSP problem using backtracking.

Took a few tries, but I've just finished it and it seems to work.

However it's a little bit slow so i'm trying to figure out how to improve it.


def search_best_route(distance_matrix):

    best_route = [None] * (len(distance_matrix[0]))
    route = best_route.copy()
    visited = [False] * (len(distance_matrix[0]))
    current_distance = 0
    return search_routes(0, route, best_route, -1,  -1, current_distance, distance_matrix, visited, 0)


def search_routes(start_index1, route, best_route, min_distance, best_distance, current_distance, distance_matrix, visited, last_visited):

    if start_index1 == len(route):

        current_distance += distance_matrix[last_visited][0]

        if current_distance < best_distance or best_distance == -1:

            best_distance = current_distance

            for k in range(0, len(route)):

                best_route[k] = route[k]

    else:

        for k in range(len(route)):

            if not visited[k]:

                new_min = min_distance - distance_matrix[last_visited][k]

                if new_min < best_distance or start_index1 == 0:

                    new_current_distance = current_distance + distance_matrix[last_visited][k]
                    route[start_index1] = k
                    visited[k] = True
                    best_distance = search_routes(start_index1 + 1, route, best_route, new_min, best_distance,
                                                  new_current_distance, distance_matrix, visited, k)[1]
                    visited[k] = False

    return best_route, best_distance


def build_distance_matrix(x, y):

    distance_matrix = [[0 for col in range(len(x))] for row in range(len(x))]

    for i in range(len(x)):

        for j in range(len(x)):

            distance_matrix[i][j] = ((x[i] - x[j]) ** 2 + (y[i] - y[j]) ** 2) ** 0.5

    return distance_matrix


n = int(input())
x = [None] * n
y = x.copy()

for i in range(n):

    x_y = input()
    x[i] = x_y.split()[0]
    y[i] = x_y.split()[1]

x = list(map(float, x))
y = list(map(float, y))

best_route = search_best_route(build_distance_matrix(x, y))
print("%.4f" % best_route[1])
print(' '.join(map(str, best_route[0])))

Sorry if i did something extremely wrong and you're dying inside watching this haha.

Solution

A minor change that would probably help is that you're iterating over the entire route list at each step. You could save a factor of n by maintaining a list of points yet to be visited, the way that recursive permutation enumeration algorithms do.

On paper, a factor of n might seem like a lot, but since the algorithm is still Θ(n!), the improvement amounts to one more point in the same running time. To do significantly better, you'll need to implement branch and bound.

Basic branch and bound isn't difficult to implement, since recursive backtracking is already most of the work for the "branch" part. The easiest bounding strategy is checking whether the distance so far exceeds the current minimum and, if so, declining to explore the branch. This works better if you explore the most promising branches first (e.g., order the candidate next points by increasing distance from the current point).

One step up in complexity is to use the 1-tree bound. Given a path on some of the points, we want to estimate the cost of extending it to a tour without going over. Every such extension consists of an edge from the last point in the given path to some unvisited point, a path on all of the unvisited points, and an edge back from some unvisited point to the first point in the given path. Getting the min cost of the path on unvisited points is hard -- that's basically TSP. We can lowerbound it, however, by computing the cost of a minimum spanning tree on those points, since a path is a spanning tree. Our final estimate is the sum of

the cost of the current path,
the distance from the first point in the current path to the closest unvisited point,
the distance from the last point in the current path to the closest unvisited point,
the cost of the minimum spanning tree on unvisited points.

From there, depending on your fluency with math, there's a whole literature on improvements for exact TSP algorithms.