Suppose I have an object X with a set of 10 features: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0].
Then, I have two more objects:
A : [2, 2, 2, 2, 2, 2, 2, 2, 2, 2]
B : [0, 0, 0, 0, 0, 0, 0, 0, 0, 20]
I need to know which of A or B is "closer" to X.
The idea I have in mind behind "similarity" is:
It is better for all features to be nearly the same than for many to be very close while a few are very different.
According to this "definition", A seems closer to X than B.
However, the arithmetic mean of the feature differences does not seem to be the right tool to implement this idea, because it is 2 for both A and B.
Is there a particular metric for this kind of problem, please?
What about the Euclidean distance?
In your case, the Euclidean distance between A and X is the square root of 40 (= 6.32 approximately) and the distance between B and X is 20, so A is indeed more similar by that metric.
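To check those numbers, here is a minimal sketch in plain Python. The helper name `euclidean` and the variables `dist_a`, `dist_b`, `mean_a`, and `mean_b` are just illustrative choices, not part of the question. It also computes the mean absolute difference to show why that measure cannot separate A from B, while the squared terms in the Euclidean distance make one large deviation count for more than many small ones.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length feature lists."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

X = [0] * 10
A = [2] * 10
B = [0] * 9 + [20]

# The mean absolute difference is identical for both objects.
mean_a = sum(abs(a - x) for a, x in zip(A, X)) / len(X)
mean_b = sum(abs(b - x) for b, x in zip(B, X)) / len(X)

# The Euclidean distance penalizes the single large deviation in B.
dist_a = euclidean(X, A)
dist_b = euclidean(X, B)

print(mean_a, mean_b)
print(round(dist_a, 2), dist_b)
print("A is closer" if dist_a < dist_b else "B is closer")
```

The smaller distance belongs to the object considered more similar, so under this metric A comes out closer to X than B.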