For a game, I made a territory map, consisting of pixels, with each territory having a different color. From there, I want to add names on every territory.
For visual purposes, I want to put each name at the centroid of its territory. Therefore, I used PIL to convert the image into a single large matrix. I built a class to record the centroid data of each territory, collected in a dictionary. Then I iterate over every pixel to update the running centroid. This method is very slow and takes around a minute for a 2400 x 1100 map.
import numpy

territory_map = numpy.array([
    [0, 0, 0, 1, 0, 0, 0],
    [0, 2, 2, 1, 0, 0, 0],
    [2, 2, 1, 1, 3, 3, 3],
    [2, 0, 0, 1, 3, 0, 0],
])

centroid_data = {}

class CentroidRecord(object):
    def __init__(self, x, y):
        super(CentroidRecord, self).__init__()
        self.x = float(x)
        self.y = float(y)
        self.volume = 1

    def add_mass(self, x, y):
        # new_x = (old_x * old_volume + x) / (old_volume + 1),
        # therefore new_x = old_x + (x - old_x) / v,
        # for v = old_volume + 1.
        self.volume += 1
        self.x += (x - self.x) / self.volume
        self.y += (y - self.y) / self.volume

for y in range(territory_map.shape[0]):
    for x in range(territory_map.shape[1]):
        cell = territory_map[y][x]
        if cell == 0:
            # 0 marks background, not a territory
            continue
        if cell not in centroid_data:
            centroid_data[cell] = CentroidRecord(x, y)
        else:
            centroid_data[cell].add_mass(x, y)

for area in centroid_data:
    data = centroid_data[area]
    print(f"{area}: ({data.x}, {data.y})")
This should print the following:
1: (2.8, 1.6)
2: (0.8, 1.8)
3: (4.75, 2.25)
Is there any faster method to do this?
Each coordinate of the centroid for a colour is simply the mean of all the coordinates of points of that colour. Accordingly, we can use a dict comprehension:
import numpy as np
n_colours = territory_map.max()
{i: tuple(c.mean() for c in np.where(territory_map.T == i))
 for i in range(1, n_colours + 1)}
Output:
{1: (2.8, 1.6),
2: (0.8, 1.8),
3: (4.75, 2.25)}
Note that we need to take the transpose because rows (y-coordinate) come before columns (x-coordinate) in NumPy arrays, so without it np.where would return the coordinates in (y, x) order.
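To make the axis order concrete, here is a small check on the example map from the question. It is only an illustration: the names `centroid_plain` and `centroid_transposed` are mine, and it looks at territory 3 alone. It shows that swapping the two arrays returned by `np.where` has the same effect as transposing first.

```python
import numpy as np

territory_map = np.array([
    [0, 0, 0, 1, 0, 0, 0],
    [0, 2, 2, 1, 0, 0, 0],
    [2, 2, 1, 1, 3, 3, 3],
    [2, 0, 0, 1, 3, 0, 0],
])

# Without the transpose, np.where returns (row indices, column indices),
# i.e. (y, x), so the two arrays have to be swapped by hand.
rows, cols = np.where(territory_map == 3)
centroid_plain = (cols.mean(), rows.mean())

# With the transpose, the first axis is x and the second is y already.
xs, ys = np.where(territory_map.T == 3)
centroid_transposed = (xs.mean(), ys.mean())

print(centroid_plain, centroid_transposed)
```

Both tuples come out in (x, y) order, so either form works; the transpose just keeps the comprehension short.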
Time taken on randomly generated data:
81.6 ms ± 5.36 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
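One thing to keep in mind is that the dict comprehension compares the whole map against each colour in turn, so the work grows with the number of territories. If that becomes a problem, a possible alternative is to accumulate the coordinate sums for all colours in one pass with np.bincount. This is a sketch of a different technique, not the method timed above, and I have not benchmarked it. The function name `centroids_bincount` is made up for this example, and it assumes the labels are small non-negative integers with 0 as background.

```python
import numpy as np

territory_map = np.array([
    [0, 0, 0, 1, 0, 0, 0],
    [0, 2, 2, 1, 0, 0, 0],
    [2, 2, 1, 1, 3, 3, 3],
    [2, 0, 0, 1, 3, 0, 0],
])

def centroids_bincount(label_map):
    """Return {label: (mean_x, mean_y)} for every non-zero label."""
    # ys[r, c] == r and xs[r, c] == c for every pixel
    ys, xs = np.indices(label_map.shape)
    labels = label_map.ravel()

    # Pixel count per label, then the sum of x and of y per label
    counts = np.bincount(labels)
    sum_x = np.bincount(labels, weights=xs.ravel())
    sum_y = np.bincount(labels, weights=ys.ravel())

    # Skip label 0 (background) and any label value that never occurs
    return {
        int(i): (float(sum_x[i] / counts[i]), float(sum_y[i] / counts[i]))
        for i in np.flatnonzero(counts)
        if i != 0
    }

print(centroids_bincount(territory_map))
```

On the example map this gives the same centroids as the dict comprehension. It does allocate two index arrays the size of the map, so it trades some memory for scanning the data a fixed number of times.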