python cluster-analysis

How to get the probability percentage of a model.predict() when clustering documents


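# vectorizer and model are assumed to be fitted already
text = "Some random text string that I want to cluster"
Y = vectorizer.transform([text])  # vectorize the single document
prediction = model.predict(Y)     # array holding the assigned cluster label
print(prediction)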

The code above takes a string and returns the cluster the model thinks it belongs to (one of three).

How can I find out how confident that prediction is, as a percentage? For example, this particular text is 90% consistent with group 1, while the next text might be only 45% consistent with group 2 but will still be placed in group 2 nonetheless. I want to be able to catch items with a low score.


Solution

  • Usually, you can't.

    A few clusterers do work with probabilities internally and may offer a predict_proba function to retrieve them, but these values capture a relative responsibility (how the clusters share a point among themselves) rather than an accuracy.
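
To illustrate why a responsibility is not an accuracy, here is a minimal sketch in plain Python. It is not the asker's model: it assumes three equally weighted spherical Gaussian clusters with a shared variance, and the centers, the `responsibilities` helper and the sample points are all hypothetical. In scikit-learn, GaussianMixture is one clusterer that exposes a comparable value through predict_proba.

```python
import math


def responsibilities(point, centers, variance=1.0):
    """Relative responsibility of each center for `point`.

    Assumes equally weighted spherical Gaussians with a shared variance
    (an illustrative simplification, not the asker's model).
    """
    # Log-density up to a constant: -||x - c||^2 / (2 * variance)
    logits = [
        -sum((x - c) ** 2 for x, c in zip(point, center)) / (2.0 * variance)
        for center in centers
    ]
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(logits)
    weights = [math.exp(v - top) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical 2-D cluster centers standing in for the three groups.
centers = [(0.0, 0.0), (5.0, 0.0), (0.0, 5.0)]

near = responsibilities((0.2, 0.1), centers)       # close to the first center
ambiguous = responsibilities((2.5, 0.0), centers)  # halfway between two centers
far = responsibilities((-40.0, -40.0), centers)    # far away from every center

for name, resp in [("near", near), ("ambiguous", ambiguous), ("far", far)]:
    print(name, [round(r, 3) for r in resp])
```

The values always sum to 1, so the far-away point still gets almost all of its responsibility assigned to the first cluster even though it fits none of them well. A low maximum (as for the ambiguous point) does flag items that sit between clusters, but a high maximum says nothing about whether the assignment is correct.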