How to cluster the elements of each list in a dictionary?

I have the following data:

dd={2: [314, 334, 298, 316, 336, 325, 337, 344, 319, 323], 1: [749, 843, 831, 795, 769]}

I tried to cluster the list elements of each key (2 and 1) into 2 clusters using kmeans. Here is my code:

from scipy.cluster.vq import kmeans, vq
from collections import defaultdict
import numpy as np

dd={2: [314, 334, 298, 316, 336, 325, 337, 344, 319, 323], 1: [749, 843, 831, 795, 769]}

new_dd = defaultdict(list)

check_cluster_list = [len(x) for ii, x in dd.items()]

number_of_clusters = 2
if number_of_clusters > min(check_cluster_list):
    print("Clusters cannot be larger than", min(check_cluster_list))
    raise Exception(f"Clusters cannot be larger than {min(check_cluster_list)}")

for indx, (id, y) in enumerate(dd.items()):
    cluster_dict = defaultdict(list)

    codebook, _ = kmeans(np.array(y, dtype=float), number_of_clusters)  
    cluster_indices, _ = vq(y, codebook)

But I need to define a different number of clusters for each key. For example:

key: 2  >> number_of_clusters=3
key: 1  >> number_of_clusters=2

My question is: how do I cluster these two lists with different values of number_of_clusters, 3 and 2 respectively?

Solution

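IIUC, you just need to make number_of_clusters a list that holds the number of clusters for each key, in the same order as the keys appear in dd (dicts keep insertion order in Python 3.7+).

Then use zip to iterate over that list and the dict items together.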

number_of_clusters = [3, 2]
.
.
.

for num, (indx, (id, y)) in zip(number_of_clusters, enumerate(dd.items())):
    cluster_dict = defaultdict(list)

    codebook, _ = kmeans(np.array(y, dtype=float), num)  
    cluster_indices, _ = vq(y, codebook)
    print(f"{codebook=}\n{cluster_indices=}\n")


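Output (kmeans starts from randomly chosen centroids, so the cluster numbering may differ between runs):

codebook=array([319.4 , 298.  , 337.75])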
cluster_indices=array([0, 2, 1, 0, 2, 0, 2, 2, 0, 0])

codebook=array([771., 837.])
cluster_indices=array([0, 1, 1, 0, 0])
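If relying on the order of the dict feels fragile, one alternative is to store the cluster counts in a second dict keyed the same way as dd. The following is only a sketch under that assumption: clusters_per_key, results and grouped are names made up for illustration, and collecting the values per cluster is a guess at what the defaultdict in the question was meant for.

```python
from collections import defaultdict

import numpy as np
from scipy.cluster.vq import kmeans, vq

dd = {2: [314, 334, 298, 316, 336, 325, 337, 344, 319, 323],
      1: [749, 843, 831, 795, 769]}

# Hypothetical mapping: key of dd -> number of clusters wanted for that key
clusters_per_key = {2: 3, 1: 2}

results = {}
for key, values in dd.items():
    k = clusters_per_key[key]
    if k > len(values):
        raise ValueError(f"key {key}: {k} clusters requested for {len(values)} values")

    obs = np.array(values, dtype=float)
    codebook, _ = kmeans(obs, k)   # centroids; may be fewer than k if a cluster ends up empty
    labels, _ = vq(obs, codebook)  # index of the nearest centroid for each value

    # Collect the original values under their cluster index
    grouped = defaultdict(list)
    for value, label in zip(values, labels):
        grouped[int(label)].append(value)

    results[key] = {"codebook": codebook, "labels": labels, "groups": dict(grouped)}

for key, res in results.items():
    print(key, res["groups"])
```

Because each count is looked up by key, the result does not depend on the order in which the keys were inserted into dd.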