matplotlib geometry networkx visualization graph-visualization

networkx: how to draw bounding area containing a set of nodes?

Here is the expected input & output of my problem:

Input:
- A G = nx.DiGraph(), usually the DAG with the number of nodes <20.
- A fixed precomputed layout pos, usually by graphviz_layout(G, prog='dot').
- A subset of nodes to highlight.
Ouput:
- Graph plot with a bounding area containing only the subset of nodes. Here is an illustration example:

The bounding area is expected to be (just some preliminary intuitions, not hard requirements):

In a smoothed polygon shape. For the smoothness, we can imagine that the area can be drawn by a brush with some stroke radius.
Contains only the given subset of nodes. Or heuristically, minimizing the maximum distance within/through the bounding area between any two nodes in the subset, and maximizing the minimum distance from any node not in the subset to the area.

Here are some of materials I searched that might help:

how to draw communities with networkx : the graph should look very like this. The only difference is that, I have a fixed pos, and the subset of nodes can be taken arbitrarily. I cannot adjust layout to make some "community" sense.
Networkx: How to visually group a set of nodes
Find the area of a bounding polygon that encloses a set of points

Is it possible to achieve what I want?

Solution

This isn't a perfect answer but some progress towards the goal. The basic ideas are:

Construct a new graph, that is the minimum spanning tree (MST) of the geometric graph covering the specified subset, i.e. the full graph where each edge has a weight equal to the distance between the nodes.
Convert the MST edges into yet another graph, in which each edge is subsampled into multiple short edges with control points between them (think pearls on a string). We then simulate a physical system, in which nodes repel edge control points but edges act as contracting springs. This yields a set of curved edge paths that avoid other nodes.
Define two elevations: a) points on the edge paths, and b) the excluded points. Perform nearest neighbour interpolation to partition the rest of the canvas into these two levels, and determine a contour line that partitions them.

Idea no. 1 is fairly straightforward.

Idea no. 2 is more involved but already implemented in netgraph (pip install netgraph), which is a graph library I wrote and maintain.

Idea no. 3 is so-so. I think something like snakes/active contours would yield better results but I can't find an implementation for sparse data, only implementations for image data.

import numpy as np
import matplotlib.pyplot as plt
import networkx as nx

from itertools import combinations
from scipy.spatial.distance import cdist
from scipy.spatial.ckdtree import cKDTree
from scipy.ndimage import gaussian_filter
from matplotlib.path import Path
from matplotlib.patches import PathPatch

from netgraph import (
    Graph,
    get_sugiyama_layout,
    get_curved_edge_paths
)


if __name__ == '__main__':

    # fig, axes = plt.subplots(1, 3, sharex=True, sharey=True, figsize=(12, 4))
    fig, axes = plt.subplots(2, 2, sharex=True, sharey=True, figsize=(10, 10))
    axes = axes.ravel()

    n = 20
    g = nx.gnr_graph(n, 0.01)
    subset = np.random.choice(n, 5, replace=False)
    node_color = {node : 'red' if node in subset else 'blue' for node in g}
    node_positions = get_sugiyama_layout(list(g.edges)) # a.k.a 'dot'
    Graph(g, edge_width=0.5, node_layout=node_positions, node_color=node_color, arrows=True, ax=axes[0])

    # --------------------------------------------------------------------------------
    # Using the nodes in the subset, construct the minimum spanning tree using distance as the weight parameter.
    xy = np.array([node_positions[node] for node in subset])
    distances = cdist(xy, xy)
    h = nx.Graph()
    h.add_weighted_edges_from([(subset[ii], subset[jj], distances[ii, jj]) for ii, jj in combinations(range(len(subset)), 2)])
    h = nx.minimum_spanning_tree(h)

    Graph(g, edge_width=0.5, node_layout=node_positions, node_color=node_color, arrows=True, ax=axes[1])
    Graph(h, nodes=list(g.nodes), edge_width=3, edge_color='red', node_layout=node_positions, node_color=node_color, ax=axes[1])

    # --------------------------------------------------------------------------------
    # Compute an edge routing that avoids other nodes. Here I use
    # a modified version of the Fruchterman-Reingold algorithm to
    # place edge control points while avoiding the nodes.
    # Change the default origin and scale to make the canvas a bit
    # larger such that the curved edges can curve outside the bbox
    # covering the nodes.
    edge_paths = get_curved_edge_paths(list(h.edges), node_positions, k=0.25, origin=(-0.5, -0.5), scale=(2, 2))
    Graph(g, edge_width=0.5, node_layout=node_positions, node_color=node_color, arrows=True, ax=axes[2])
    for _, path in edge_paths.items():
        x, y = path.T
        axes[2].plot(x, y, color='magenta', linewidth=10, zorder=-1)

    # --------------------------------------------------------------------------------
    # Use nearest neighbour interpolation to partition the canvas into 2 regions.

    xy1 = np.concatenate(list(edge_paths.values()), axis=0)
    z1 = np.ones(len(xy1))

    xy2 = np.array([node_positions[node] for node in node_positions if node not in subset])
    z2 = np.zeros(len(xy2))

    # Add a frame around the axes.
    # This reduces the desired mask in regions where there are no nearby point from the exclusion list.
    xmin, xmax = axes[2].get_xlim()
    ymin, ymax = axes[2].get_ylim()
    xx = np.linspace(xmin, xmax, 100)
    yy = np.linspace(ymin, ymax, 100)

    left = np.c_[np.full_like(xx, xmin), yy]
    top = np.c_[xx, np.full_like(yy, ymax)]
    right = np.c_[np.full_like(xx, xmax), yy]
    bottom = np.c_[xx, np.full_like(yy, ymin)]

    xy3 = np.concatenate([left, top, right, bottom], axis=0)
    z3 = np.zeros((len(xy3)))

    xy = np.concatenate([xy1, xy2, xy3], axis=0)
    z = np.concatenate([z1, z2, z3])
    tree = cKDTree(xy)
    xy_grid = np.meshgrid(xx, yy)
    _, indices = tree.query(np.reshape(xy_grid, (2, -1)).T, k=1)
    z_grid = z[indices].reshape(xy_grid[0].shape)

    # smooth output
    z_smooth = gaussian_filter(z_grid, 1.5)

    Graph(g, edge_width=0.5, node_layout=node_positions, node_color=node_color, arrows=True, ax=axes[3])
    contour = axes[3].contour(xy_grid[0], xy_grid[1], z_smooth, np.array([0.9]), alpha=0)
    patch = PathPatch(contour.collections[0].get_paths()[0], facecolor='red', alpha=0.5, zorder=-1)
    axes[3].add_patch(patch)

    plt.show()