When I try to cluster using affinity propagation, the warning below is raised and the number of clusters is reported as one.
...\anaconda\lib\site-packages\sklearn\cluster\_affinity_propagation.py:246: ConvergenceWarning: Affinity propagation did not converge, this model will not have any cluster centers.
  warnings.warn("Affinity propagation did not converge, this model "
Below is the code I tried.
from collections import Counter

from sklearn.cluster import AffinityPropagation
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer


def build_feature_matrix(documents, feature_type='frequency',
                         ngram_range=(1, 1), min_df=0.0, max_df=1.0):
    # Pick the vectorizer that matches the requested feature type.
    feature_type = feature_type.lower().strip()
    if feature_type == 'binary':
        vectorizer = CountVectorizer(binary=True, min_df=min_df,
                                     max_df=max_df, ngram_range=ngram_range)
    elif feature_type == 'frequency':
        vectorizer = CountVectorizer(binary=False, min_df=min_df,
                                     max_df=max_df, ngram_range=ngram_range)
    elif feature_type == 'tfidf':
        vectorizer = TfidfVectorizer(min_df=min_df, max_df=max_df,
                                     ngram_range=ngram_range)
    else:
        raise Exception("Wrong feature type entered. Possible values: 'binary', 'frequency', 'tfidf'")
    feature_matrix = vectorizer.fit_transform(documents).astype(float)
    return vectorizer, feature_matrix


vectorizer, feature_matrix = build_feature_matrix(filtered_list_6,
                                                  feature_type='tfidf',
                                                  min_df=0.15, max_df=0.85,
                                                  ngram_range=(1, 2))


def affinity_propagation(feature_matrix):
    # Document-by-document similarity matrix from the feature vectors.
    sim = feature_matrix * feature_matrix.T
    sim = sim.todense()
    ap = AffinityPropagation()
    ap.fit(sim)
    clusters = ap.labels_
    return ap, clusters


ap_obj, clusters = affinity_propagation(feature_matrix=feature_matrix)
df[len(df.columns)] = clusters

# Count how many documents fall into each cluster label.
c = Counter(clusters)
print(c.items())
total_clusters = len(c)
print('Total Clusters:', total_clusters)
Could someone point out what I am doing wrong here?
Thanks in advance!
I was able to eliminate the issue by changing the damping, max_iter and preference values. As a starting point, try damping = 0.9 and max_iter = 1000.
You can then adjust the preference value as needed; it changes the number of clusters the model generates (higher values give more clusters, lower values fewer).
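
A minimal sketch of these settings, in case it helps. The toy `docs` list and the helper name `cluster_documents` are hypothetical stand-ins for the asker's data and code. I also pass `affinity="precomputed"`, since the matrix handed to `fit` is already a similarity matrix; that is my assumption about the intent, as the original code used the default Euclidean affinity. The `random_state` argument assumes scikit-learn 0.23 or newer.

```python
import warnings

from sklearn.cluster import AffinityPropagation
from sklearn.exceptions import ConvergenceWarning
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical toy corpus standing in for the asker's filtered_list_6.
docs = [
    "the cat sat on the mat",
    "a cat and a kitten on the mat",
    "cats and kittens sleep on mats",
    "stock markets fell sharply today",
    "the stock market rallied today",
    "investors sold stock as markets fell",
]


def cluster_documents(documents, damping=0.9, max_iter=1000, preference=None):
    """Cluster documents with affinity propagation on TF-IDF similarities.

    Returns the label array and a flag telling whether the fit finished
    without a ConvergenceWarning.
    """
    features = TfidfVectorizer().fit_transform(documents)
    # TF-IDF rows are L2-normalised by default, so this is cosine similarity.
    # toarray() gives a plain ndarray rather than a numpy.matrix.
    sim = (features @ features.T).toarray()

    ap = AffinityPropagation(
        affinity="precomputed",  # assumption: treat sim as similarities
        damping=damping,
        max_iter=max_iter,
        preference=preference,   # None means the median similarity is used
        random_state=0,
    )
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        ap.fit(sim)
    converged = not any(
        issubclass(w.category, ConvergenceWarning) for w in caught
    )
    return ap.labels_, converged


# Try a few preference values and see how the cluster count responds.
for pref in (None, 0.0, -1.0):
    labels, converged = cluster_documents(docs, preference=pref)
    n_clusters = len(set(labels.tolist()) - {-1})  # -1 marks "no cluster"
    print(f"preference={pref}: clusters={n_clusters}, converged={converged}")
```

If the warning still shows up on your data, raising `max_iter` further or moving `damping` closer to 1 are the usual next things to try; the exact values that work depend on the similarity matrix.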