Suppose we have a particular number of points distributed in the plane at the nodes of a grid having a given step. These points form several continuous regions.
We can tell that points (if there is more than one point) form a continuous region if for each point in region there is another point at a distance less than or equal to step*sqrt(2)
.
It is necessary to have only a single list (or dictionary) of coordinates and grid step, to distribute the coordinates to new arrays. Each new array should contain coordinates that belong to a particular region.
Below is a picture for clarity. In this case, we have three continuous regions.
Now I have the following implementation.
from itertools import groupby
step = 1
regions = [] # list of lists with points from particular regions
while all_equal(list(points.values())) == False: # while not all points have been explored ("points" is a single dict with coordinates, where keys are numbers of points and values are tuples with coordinates)
region = [] # create a list for new region
region.append(points[list(points.keys())[0]]) # add to the new list the first value from "points"
points[list(points.keys())[0]] = None # "None" means that this point has already been explored
for point in points:
for region_point in region:
if points[point] != None: # if this point has not already been explored
if distance(points[point], region_point) <= step*2**0.5: # if these points form a continuous segment
region.append(points[point])
points[point] = None
break
regions.append(region)
points = {k:v for k,v in points.items() if v != None} # remove all explored points from a single dict
for element in regions:
print element
def all_equal(array):
g = groupby(array)
return next(g, True) and not next(g, False)
But this code works incorrectly. Moreover, it is organized in a very suboptimal way. But now I have no idea how it could be realized differently.
An example of input data:
points = {
1: (1, 10),
2: (1, 8),
3: (2, 9),
4: (2, 6),
5: (3, 8),
6: (3, 7),
7: (3, 6),
8: (6, 7),
9: (7, 8),
10: (7, 7),
11: (8, 7),
12: (8, 6),
13: (10, 5),
14: (11, 6),
15: (12, 6),
16: (12, 5),
17: (13, 6),
18: (13, 5)
}
Expected result:
[(1, 10), (1, 8), (2, 9), (2, 6), (3, 8), (3, 7), (3, 6)]
[(6, 7), (7, 8), (7, 7), (8, 7), (8, 6)]
[(10, 5), (11, 6), (12, 6), (12, 5), (13, 6), (13, 5)]
Current result:
[(1, 10), (2, 9), (3, 8), (3, 7), (3, 6)]
[(1, 8)]
[(2, 6)]
[(6, 7), (7, 8), (7, 7), (8, 7), (8, 6)]
[(10, 5), (11, 6), (12, 6), (12, 5), (13, 6), (13, 5)]
Please use Python 2 to solve this problem (or Python 3 without libraries that are missing in Python 2).
This does the trick (tried on Python 2.7.18, works fine, and just so happens to be artisanal code that works on Python 3.11 too).
The idea is that we do a sort of "flood-fill" or walk across the cells.
The output is
[(1, 10), (1, 8), (2, 9), (2, 6), (3, 8), (3, 7), (3, 6)]
[(6, 7), (7, 8), (7, 7), (8, 7), (8, 6)]
[(10, 5), (11, 6), (12, 6), (12, 5), (13, 6), (13, 5)]
as expected.
def get_neighbors(point_id_to_coords, open_point_ids, point_id):
x, y = point_id_to_coords[point_id]
for neighbor in open_point_ids:
nx, ny = point_id_to_coords[neighbor]
if abs(x - nx) <= 1 and abs(y - ny) <= 1:
yield neighbor
def cluster_points(point_id_to_coords):
# Create a set of all point IDs that haven't been visited yet
open_point_ids = set(point_id_to_coords)
# While there are unclustered points, choose a random point
while open_point_ids:
start = open_point_ids.pop()
# Initialize a new cluster, and its open points with only the start point
cluster = {start}
cluster_open = {start}
# While there are still points to be examined for this cluster, choose a random point
while cluster_open:
point = cluster_open.pop()
neighbor_ids = set(get_neighbors(point_id_to_coords, open_point_ids, point))
# For each neighboring point, add it to the cluster and remove it from the unvisited sets
for neighbor in neighbor_ids:
cluster.add(neighbor)
cluster_open.add(neighbor)
open_point_ids.remove(neighbor)
yield cluster
points = {
1: (1, 10),
2: (1, 8),
3: (2, 9),
4: (2, 6),
5: (3, 8),
6: (3, 7),
7: (3, 6),
8: (6, 7),
9: (7, 8),
10: (7, 7),
11: (8, 7),
12: (8, 6),
13: (10, 5),
14: (11, 6),
15: (12, 6),
16: (12, 5),
17: (13, 6),
18: (13, 5),
}
for cluster in cluster_points(points):
print([points[pid] for pid in cluster])