K Means Cluster Update Assignment

I am running Python 3 and am implementing K Means clustering. I have written functions for the Euclidean distance and for assigning data. Now I want to update the assignment and write a function that returns a new dictionary whose keys are the centroid names and whose values are lists of the points that belong to each centroid.

def update_assignment(data, centroids):

I know I need to reuse the assign_data function I created earlier. I want to do this without using numpy, but am totally stuck. Looking for suggestions. Do I need to iterate through the data again and have an if statement that compares the previous distance? It seems like I would not need to call the previous distances since I have already created a function for it. Any help would be much appreciated.

Solution

Yes, you need to iterate through the data. Initialize a dictionary in which each centroid is mapped to an empty list. Then, for each data point x, you can use a list comprehension to find the distance to each centroid, something like:

[euclidean_distance(x, c) for c in centroids]

The index of the smallest element in this list identifies the centroid nearest to x, which is its new centroid. Then you can append x to that centroid's list in the dictionary.
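Putting those steps together, here is a minimal sketch without numpy. It rests on some assumptions, since the question does not show the data layout: centroids is taken to be a dictionary mapping a centroid name to its coordinates, data is a list of points, and euclidean_distance below is only a stand-in for the function you already wrote. It does not call assign_data, because that function's signature is not shown.

```python
import math


def euclidean_distance(p, q):
    # Stand-in for the asker's existing function (assumed signature):
    # p and q are equal-length sequences of numbers.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def update_assignment(data, centroids):
    # Assumption: centroids is a dict of {name: coordinates}
    # and data is a list of points.
    assignment = {name: [] for name in centroids}
    names = list(centroids)  # fixed order so list indices map back to names

    for x in data:
        # Distance from x to every centroid, in the same order as names.
        distances = [euclidean_distance(x, centroids[name]) for name in names]
        # The index of the smallest distance identifies the nearest centroid.
        nearest = names[distances.index(min(distances))]
        assignment[nearest].append(x)

    return assignment
```

Called with data = [[0, 0], [1, 1], [9, 9], [10, 10]] and centroids = {"c1": [0, 0], "c2": [10, 10]}, this puts the first two points under "c1" and the last two under "c2". If two centroids are equally close, list.index picks the first one in names; a centroid with no nearby points keeps an empty list.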