I'm trying to implement the agglomerative clustering algorithm, and to check the distances between each pair of clusters I use this:
    a, b = None, None
    c = max
    for i in range(len(map)-1):
        for n in range(len(map[i])):
            for j in range(i+1, len(map)):
                for m in range(len(map[j])):
                    # dist is distance func.
                    d = dist(map[i][n], map[j][m])
                    if c > d:
                        a, b, c = i, j, d
    print(a, ' ', b)
    return a, b
map looks like this: { 0: [[1,2,3], [2,2,2]], 1: [[3,3,3]], 2: [[4,4,4], [5,5,5]] }
What I expect is for every item in each row to be compared with every item in every other row. So something like this:
comparisons: [1,2,3] and [3,3,3], [1,2,3] and [4,4,4], [1,2,3] and [5,5,5], [2,2,2] and [3,3,3], and so on.
When I run this, it works only the first time; every subsequent call fails at line 6 with a KeyError.
I suspect that the problem is either here or in merging clusters.
If map is a dict of values, you have a general problem with your indexing:
    for m in range(len(map[j])):
You use range() to create numerical indices. However, what you need j to be in this example is a valid key of the dictionary map.
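A small sketch of why this can work once and then fail. The merge step is not shown in the question, so the state below is an assumption: it supposes that merging moved the points of cluster 1 into cluster 0 and deleted key 1. The name clusters is used here instead of map only to avoid shadowing the built-in.

```python
# Hypothetical state after one merge (assumed, the merge code is not shown):
# cluster 1 was merged into cluster 0 and its key was removed.
clusters = {0: [[1, 2, 3], [2, 2, 2], [3, 3, 3]], 2: [[4, 4, 4], [5, 5, 5]]}

positions = list(range(len(clusters)))  # indices produced by range()
keys = list(clusters)                   # keys that actually exist

try:
    clusters[1]      # 1 is a valid position, but no longer a key
    failed = False
except KeyError:
    failed = True

print(positions, keys, failed)
```

On the first call the keys happen to be 0, 1, 2, so positions and keys coincide; after a key is removed they no longer do.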
EDIT:
That is, of course, assuming that you did not use 0-based incremented integers as the keys of map, in which case you might as well have gone with a list. In general you seem to be relying on the ordering provided by a list or OrderedDict (or dict in Python 3.6+ as an implementation detail, guaranteed from 3.7). See
    for j in range(i+1, len(map)):
as a good example. Therefore I would advise using a list.
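A minimal sketch of the list-based layout, assuming each cluster is simply a list of points. The merge helper is hypothetical (it is not from the question) and assumes a < b.

```python
# Clusters stored as a list: positions are always 0 .. len(clusters) - 1.
clusters = [[[1, 2, 3], [2, 2, 2]], [[3, 3, 3]], [[4, 4, 4], [5, 5, 5]]]


def merge(clusters, a, b):
    """Merge cluster b into cluster a in place (hypothetical helper, assumes a < b)."""
    clusters[a].extend(clusters[b])
    del clusters[b]  # later clusters shift down, so indices stay contiguous


merge(clusters, 0, 1)
print(clusters)
```

Because deleting from a list closes the gap, range(len(clusters)) keeps producing valid indices after every merge.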
EDIT 2: Alternatively, create a list of the map.keys() and use it to index the map:
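    a, b = None, None
    c = float('inf')  # instead of max: a function cannot be compared with a number in Python 3
    keys = list(map.keys())
    for i in range(len(map)-1):
        for n in range(len(map[keys[i]])):
            for j in range(i+1, len(map)):
                for m in range(len(map[keys[j]])):
                    # dist is distance func.
                    d = dist(map[keys[i]][n], map[keys[j]][m])
                    if c > d:
                        a, b, c = i, j, d
    # a and b are positions in keys; the dict keys are keys[a] and keys[b]
    print(a, ' ', b)
    return a, b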
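The same search can also be sketched without any positional indexing, by iterating over pairs of dict items directly. This is only a sketch under assumptions: closest_clusters is a hypothetical name, math.dist (Euclidean distance, Python 3.8+) stands in for the question's dist, and the function returns the dict keys themselves rather than positions.

```python
import itertools
import math


def closest_clusters(clusters, dist=math.dist):
    """Return the keys of the two clusters containing the closest pair of points."""
    best = (None, None)
    best_d = float("inf")
    # Every unordered pair of (key, points) entries, each pair visited once.
    for (ka, pts_a), (kb, pts_b) in itertools.combinations(clusters.items(), 2):
        for p in pts_a:
            for q in pts_b:
                d = dist(p, q)
                if d < best_d:
                    best, best_d = (ka, kb), d
    return best


# Keys are deliberately non-contiguous, as they might be after some merges.
clusters = {0: [[1, 2, 3], [2, 2, 2]], 2: [[3, 3, 3]], 5: [[5, 5, 5], [6, 6, 6]]}
result = closest_clusters(clusters)
print(result)
```

Since the loop never turns a position into a key, it should not matter which keys have been removed by earlier merges.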