I am using kNN algorithm to classify. In weka they have provided various parameter setting for kNN. I am intersted to know about the distanceWeighting, meanSquared.
In distanceWeighting we have three values (No distance weighting, weight by 1/distance and weight by 1-distance). What are these values and what is their impact?
Can someone please expalin me? :)
If one uses "no distance weighting", then the predicted value for your data points is the average of all k neighbors. For example
# if values_of_3_neigbors = 4, 5, 6
# then predicted_value = (4+5+6)/3 = 5
For 1/distance weighting, the weight of each neigbor is inversely proportional to the distance to it. The idea is: the closer the neighbor, the more it influences the predicted value. For example
# distance_to_3_neigbors = 1,3,5
# weights_of_neighbors = 1/1, 1/3, 1/5 # sum = 1 + 0.33 + 0.2 = 1.53
# normalized_weights_of_neighbors = 1/1.53, 0.33/1.53, 0.2/1.53 = 0.654, 0.216, 0.131
# then predicted_values = 4*0.654 + 5*0.216 + 6*0.131 = 4.48
For 1-distance it is similar. This is only applicable when all your distances are in the [0,1] range.
Hope this helps