Tags: nodes, graph-algorithm, matching, networkx, branch-and-bound

Best matches of nodes using branch and bound


The idea for this problem is to explore all nodes of an undirected graph and all of their neighbors.

Each pairing has an associated weight. The goal is to create as many pairs as possible with minimal total weight.

Once a pair is created, its nodes cannot be joined again. To implement this algorithm using branch and bound, I have to find an upper bound and a lower bound.

My idea is to keep a list of solutions and a partial list into which I insert all possible pairs of neighbors.

Then, if the cost is lower, I add the pairing to the list of solutions.

The problem is that I do not know which heuristics to use to obtain these bounds.

Any idea in pseudocode?

Sincerely


Edit

I'm sorry I left the problem so open. Let me explain it with an image.

In the image, three pairings can be observed: (1,3), (2,5), (6,4).

[Image: optimal matching]

These are the optimal pairings.

For the optimal solution, the weight of each chosen pair (x, y) must be as small as possible, and x and y must never be matched again.

A termination condition I thought of: return the list of solutions, containing all the matches, once the following condition is met:

len(G.nodes()) / 2 == len(solutions)

I did this using a greedy algorithm.

I iterate over all nodes, joining each one to the unvisited neighbor with the least weight.

But this way, I cannot guarantee optimality.
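
Roughly what I did, as a sketch (assuming a NetworkX graph whose edges carry a "weight" attribute; the weights below are only illustrative):

import networkx as nx

def greedy_matching(G):
    # Pair each unmatched node with its cheapest still-unmatched neighbor.
    # Fast, but not guaranteed to find the optimal matching.
    matched = set()
    solutions = []
    for u in G.nodes():
        if u in matched:
            continue
        candidates = [(G[u][v]["weight"], v)
                      for v in G.neighbors(u) if v not in matched]
        if not candidates:
            continue
        w, v = min(candidates)
        matched.update({u, v})
        solutions.append((u, v, w))
    return solutions

# illustrative example graph
G = nx.Graph()
G.add_weighted_edges_from([(1, 3, 1), (1, 2, 4), (2, 5, 2),
                           (4, 6, 3), (3, 4, 5), (5, 6, 6)])
solutions = greedy_matching(G)
print(solutions)
print(len(G.nodes()) / 2 == len(solutions))  # the termination check from above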


Solution

  • As Abdallah Sobehy writes in his comment to you, this is more of an open discussion than a question, and possibly (in its current form) not suitable for SE. I will, however, give it a shot, and post it here rather than in comments due to its length.

    For this discussion, let's denote your complete undirected graph as G(N, E), with set of nodes N and set of edges E. We shall also assume that the number of nodes in G is even.

    Anyway, in the context of Branch-and-Bound (BAB), your upper bound is naturally your best (cheapest) incumbent feasible solution so far. Such a solution can be constructed heuristically at initialisation with ease, e.g.

    i.   let G' = G and E'=E
    ii.  Choose a node, say i, randomly from G'.
    iii. With i fixed, pick the edge (i,j) in E' that minimises cost,
         i.e., the cheapest edge from node i to any other node j in G'.
    iv.  Remove nodes i and j from G', and remove all edges associated
         with nodes i and j from E'.
    v.   Is G' non-empty? Go to ii. Otherwise, terminate.
    

    Or some more clever variation of the above. The attained feasible solution shall be our initial upper bound, say UB.
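
    A minimal sketch of steps i.-v. in Python, assuming a complete NetworkX graph with an even number of nodes and a "weight" attribute on every edge (names are only illustrative):

    import random
    import networkx as nx

    def initial_upper_bound(G):
        # Steps i.-v.: repeatedly pick a random node and pair it with its
        # cheapest remaining neighbour; remove both nodes and their edges.
        Gp = G.copy()                                   # G' (and implicitly E')
        pairs, cost = [], 0.0
        while Gp.number_of_nodes() > 0:
            i = random.choice(list(Gp.nodes()))         # step ii.
            j = min(Gp.neighbors(i),
                    key=lambda v: Gp[i][v]["weight"])   # step iii.
            cost += Gp[i][j]["weight"]
            pairs.append((i, j))
            Gp.remove_nodes_from([i, j])                # step iv.
        return pairs, cost                              # cost is the initial UB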

    Now, in the context of BAB, one usually looks at a relaxation of the original problem, one that can be solved to optimality with ease; compare with the continuous relaxation of some integer linear program (ILP) to attain a linear program (LP) readily solved to optimality using the simplex method -- a common use for BAB.

    For our purposes, we need to specify a way of relaxing your problem to a form that can be readily solved, where the cost of the optimal solution of this relaxed form is no greater than that of the original problem; hence providing a lower bound, say LB, to the latter. If we formalise the problem in mathematical terms, this becomes much easier. We introduce binary variables x_ij (taking values 0 or 1),

    x_ij := 1, if pairing between nodes i and j is used, i!=j
            0, otherwise
    c_ij := cost (or weight) of using edge (i, j), i!=j
    

    where i, j = 1, ..., |N|, with i != j. From now on, let n denote |N|, i.e., n = |N|.

    With this, we can state the following binary linear program, say (BLP1), to

    minimize     ∑_{i=1 to n} ∑_{j=1 to n, j!=i} (c_ij * x_ij)/2,    (a)
    subject to   ∑_{j!=i} x_ij = 1,   for i = 1, ..., n,             (b)
                 x_ij ∈ {0, 1}.                                      (c)
    

    The "/2" takes into account that---with the form of the objective function sums---each pair will be associated with two non-zero x_ij variables (i.e., counted twice); if e.g. nodes 1 and 4 are paired, x_(1,4)=x_(4,1)=1. The n constraints in (b) ensure that each node will be paired with exactly one other node.

    This program can be relaxed by replacing the binary (integrality) constraint (c) with its continuous relaxation, i.e., replace (c) by:

                           x_ij ∈ [0, 1],                            (c')    
    

    yielding the following linear program, say (LP1), to

    minimize     ∑_{i=1 to n} ∑_{j=1 to n, j!=i} (c_ij * x_ij)/2,    (a)
    subject to   ∑_{j!=i} x_ij = 1,   for i = 1, ..., n,             (b)
                 x_ij ∈ [0, 1].                                      (c')
    

    The solution space of (BLP1) is obviously a subspace of (LP1), and hence the optimal solution to (LP1) yields a lower bound for the optimal solution to (BLP1). Since (LP1) is a linear program, it can be readily solved by the Simplex method, using whatever favourite optimisation library you prefer.
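
    As a sketch of how (LP1) could be set up and solved with an off-the-shelf LP solver -- here scipy.optimize.linprog, purely as an example -- one can use a single variable per unordered pair (i, j), i < j, which is equivalent to the x_ij/x_ji pair above and removes the need for the "/2":

    import itertools
    import numpy as np
    from scipy.optimize import linprog

    def solve_lp1(c):
        # c is a symmetric n x n cost matrix for the complete graph.
        n = len(c)
        pairs = list(itertools.combinations(range(n), 2))
        cost = np.array([c[i][j] for i, j in pairs])         # objective (a)

        # constraints (b): every node belongs to exactly one chosen pair
        A_eq = np.zeros((n, len(pairs)))
        for k, (i, j) in enumerate(pairs):
            A_eq[i, k] = A_eq[j, k] = 1.0
        b_eq = np.ones(n)

        # (c'): 0 <= x <= 1 for every variable
        res = linprog(cost, A_eq=A_eq, b_eq=b_eq, bounds=(0, 1), method="highs")
        return pairs, res    # res.x may be fractional; res.fun is the LB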

    Now, for the BAB process, solve (LP1) and, in its optimal solution, choose some fractional x_ij, i.e., some x_ij ∈ (0, 1), which we shall---in the branching children of (LP1)---either enforce (up cut) or exclude (down cut). Let's denote this variable x_ab. Branching on this x_ab variable, w.r.t. the graph matching problem, can be described as: "enforce using edge (a,b) with its full weight (x_ab=1) in subsequent subproblems, or enforce full exclusion of edge (a,b) (x_ab=0) in subsequent subproblems".

    Branching on x_ab yields---from (LP1)---two subproblems, say (LP1down) and (LP1up), of the following form

    (LP1down)
    minimize     ∑_{i=1 to n} ∑_{j=1 to n, j!=i} (c_ij * x_ij)/2,    (a)
    subject to   ∑_{j!=i} x_ij = 1,   for i = 1, ..., n,             (b)
                 x_ij ∈ [0, 1],                                      (c')
                 x_ab = 0,                                           (d1)

    (LP1up)
    minimize     ∑_{i=1 to n} ∑_{j=1 to n, j!=i} (c_ij * x_ij)/2,    (a)
    subject to   ∑_{j!=i} x_ij = 1,   for i = 1, ..., n,             (b)
                 x_ij ∈ [0, 1],                                      (c')
                 x_ab = 1,                                           (d2)
    

    Just re-solve these linear programming problems (LP1down) and (LP1up), and iteratively repeat the branch/solve process until subproblems can be pruned, by:

    bound: the optimal (linear programming) solution of some sub-problem 
           is larger than the UB of the original problem (BLP1). This means
           proceeding along such a branch will never give a better (BLP1)
           solution than the best incumbent one.
    
    optimality: optimal (linear programming) solution of some sub-problem
                is binary valued -> feasible in (BLP1).
                  -> update UB with new best incumbent solution.
    
    infeasibility: some sub-problem is infeasible; branching upon it will
                   still yield an infeasible problem.
    

    If you run your BAB process until termination, you will guarantee optimality (the issue for large problems is rather tractability). Note that if the number of nodes is odd, (BLP1) will be infeasible.
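
    Putting the pieces together, a self-contained sketch of the whole BAB loop (again with scipy.optimize.linprog and one variable per unordered pair; this is illustrative, not tuned for large instances) could look like:

    import itertools
    import numpy as np
    from scipy.optimize import linprog

    def bab_min_weight_matching(c):
        # c: symmetric n x n cost matrix of a complete graph, n even.
        n = len(c)
        pairs = list(itertools.combinations(range(n), 2))
        cost = np.array([c[i][j] for i, j in pairs])
        A_eq = np.zeros((n, len(pairs)))
        for k, (i, j) in enumerate(pairs):
            A_eq[i, k] = A_eq[j, k] = 1.0
        b_eq = np.ones(n)

        # UB and incumbent; best_cost could be seeded with the heuristic UB above
        best_cost, best_x = np.inf, None
        stack = [[(0.0, 1.0)] * len(pairs)]          # a subproblem = per-variable bounds

        while stack:
            bounds = stack.pop()
            res = linprog(cost, A_eq=A_eq, b_eq=b_eq, bounds=bounds, method="highs")
            if not res.success:                      # prune: infeasibility
                continue
            if res.fun >= best_cost:                 # prune: bound
                continue
            frac = [k for k, v in enumerate(res.x) if 1e-6 < v < 1 - 1e-6]
            if not frac:                             # prune: optimality (binary solution)
                best_cost, best_x = res.fun, res.x   # update UB / incumbent
                continue
            k = frac[0]                              # branch on a fractional x_ab
            down = list(bounds); down[k] = (0.0, 0.0)    # (LP1down): x_ab = 0
            up = list(bounds); up[k] = (1.0, 1.0)        # (LP1up):   x_ab = 1
            stack.extend([down, up])

        if best_x is None:                           # e.g. an odd number of nodes
            return None, np.inf
        matching = [pairs[k] for k, v in enumerate(best_x) if v > 0.5]
        return matching, best_cost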


    If you do not want to resort to linear programming techniques, try to construct your own way of relaxing your original problem to one that has the following properties:

    • The solution space of your original problem is a subset of the solution space of your relaxed problem.
    • Your relaxed problem is readily solved by some means, e.g. heuristics.
    • Decide what "branching" means in your context: I would suggest fixing inclusion or exclusion of a node pair.

    Then simply re-use the general BAB approach as covered briefly above. If you delve deeper into BAB, there are several ways to improve the framework, e.g. by choosing which subproblems to process first ("node picking") or which variables (in the formal treatment) to branch upon ("branching rules").