machine-learning, mahout, pearson

Finding dissimilar dimensions in a feature vector in Mahout


If I use a similarity-based measure such as the Pearson correlation score to compare two feature vectors, and I want to know which dimensions (feature fields) are most dissimilar between the two vectors, what algorithm should I use? I am using Mahout, a machine learning library for Java.


Solution

  • Well, it would just be the dimension in which the two vectors differ most: the one where the absolute value of the difference between the vectors' values is largest. Is that really all you mean, or are you looking for something subtler?
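
The idea in the answer can be sketched in plain Java as follows. This is only an illustration under stated assumptions: plain `double[]` arrays stand in for Mahout's vector types, and the class and method names (`DissimilarDimensions`, `rankByAbsoluteDifference`, `mostDissimilar`) are hypothetical, not part of Mahout. It also assumes both vectors have the same length and that the features are on comparable scales; if they are not, the values would probably need to be normalized first.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical helper class, not part of Mahout.
final class DissimilarDimensions {

    private DissimilarDimensions() {
    }

    /** Returns the dimension indices ordered from largest to smallest |a[i] - b[i]|. */
    static Integer[] rankByAbsoluteDifference(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same length");
        }
        Integer[] indices = new Integer[a.length];
        for (int i = 0; i < indices.length; i++) {
            indices[i] = i;
        }
        // Sort indices by absolute difference, largest first.
        // The sort is stable, so tied dimensions keep their original order.
        Arrays.sort(indices,
                Comparator.comparingDouble((Integer i) -> Math.abs(a[i] - b[i])).reversed());
        return indices;
    }

    /** Returns the index of the single dimension with the largest absolute difference. */
    static int mostDissimilar(double[] a, double[] b) {
        if (a.length != b.length || a.length == 0) {
            throw new IllegalArgumentException("Vectors must be non-empty and of equal length");
        }
        int best = 0;
        for (int i = 1; i < a.length; i++) {
            if (Math.abs(a[i] - b[i]) > Math.abs(a[best] - b[best])) {
                best = i;
            }
        }
        return best;
    }
}
```

For example, with `a = {1.0, 2.0, 3.0, 4.0}` and `b = {1.5, 2.0, 9.0, 1.0}`, the absolute differences are 0.5, 0, 6 and 3, so `mostDissimilar` picks index 2 and the ranking is 2, 3, 0, 1. Returning the full ranking rather than a single index makes it easy to take the top few dissimilar fields instead of only one.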