Suppose I have a set of entities (for example people with their physical characteristics) and I want to find, for a given entity X, all entities related (or similar) to it, for some definition of similarity.
I can easily find such entities for one dimension (all people with height Y ~= X's height within a certain threshold) but is there some approach that I can use to find similar entities considering more than one attribute?
It depends on how you define similarity, but you can apply the same approach you use for 1D to any number of dimensions, with a small generalization. Assuming each element is represented as a vector, you can measure the distance between two vectors x and y as d = |x - y|, and accept or reject depending on whether d is within some threshold.
Here, the minus operator is vector subtraction:
(a1,a2,...,an)-(b1,b2,...,bn)=(a1-b1,a2-b2,...,an-bn)
and the absolute value is the vector norm (the Euclidean length):
|(a1,a2,...,an)| = sqrt(a1^2 + a2^2 + ... + an^2)
It is easy to see that this is a generalization of your 1D example: for vectors with a single element it reduces to the same check, since |(a1)-(b1)| = sqrt((a1-b1)^2) = |a1-b1|.
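As a rough illustration of the distance-and-threshold idea, here is a minimal Python sketch. The function names (euclidean_distance, find_similar), the sample data, and the threshold value are all made up for the example, and it assumes every entity is a tuple of numbers of the same length.

```python
import math

def euclidean_distance(x, y):
    """Return |x - y|: the Euclidean length of the element-wise difference."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def find_similar(target, candidates, threshold):
    """Keep the candidates whose distance to target is within the threshold."""
    return [c for c in candidates if euclidean_distance(target, c) <= threshold]

# Hypothetical data: (height in cm, weight in kg).
people = [(180.0, 75.0), (182.0, 78.0), (160.0, 55.0)]
x = (181.0, 76.0)

similar = find_similar(x, people, threshold=5.0)
print(similar)
```

With one-element tuples the same function behaves like the 1D check, for example euclidean_distance((3,), (7,)) gives 4.0. This is a linear scan over all candidates, which is fine for small sets.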
One downside of this approach is that (0,0,0,...,0,10^20) and (0,0,0,...,0) will be very far from each other, even though they differ in only one attribute. That might or might not be what you want. If it is not, you will need a different distance metric, but which one really depends on what exactly you are after.
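One possible alternative, offered only as a sketch and not necessarily the right metric for your case, is to divide each attribute's difference by a per-attribute scale before taking the length, so that you control how much each attribute counts. The name scaled_distance and the scale values below are assumptions made up for the example.

```python
import math

def scaled_distance(x, y, scales):
    """Euclidean distance after dividing each attribute difference by its scale."""
    if not (len(x) == len(y) == len(scales)):
        raise ValueError("x, y and scales must have the same length")
    return math.sqrt(sum(((a - b) / s) ** 2 for a, b, s in zip(x, y, scales)))

# Hypothetical scales: a 10 cm height gap counts as much as a 5 kg weight gap.
scales = (10.0, 5.0)
d = scaled_distance((180.0, 75.0), (170.0, 75.0), scales)
print(d)
```

A larger scale makes differences in that attribute matter less. A common choice is each attribute's standard deviation over the data set, but that is a design decision that depends on what similarity should mean for you.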