I'm trying to use Kendall's distance to improve sets of rankings obtained with the Borda count method.
I've been asked to follow a specific document's instructions. The document states:
"The Kendall's distance counts the pairwise disagreements between items from two rankings as :
where
The Kendall's distance is normalized by its maximum value C2n. The less the Kendall’s distance is, the greater the similarity degree between the rankings is.
The Kendall's tau is another method for measuring the similarity degree between rankings, which is easily confused with the Kendall's distance.
The Kendall's tau is defined as:

$$\tau(R_a, R_b) = 1 - \frac{2\,K(R_a, R_b)}{C_n^2}$$

i.e. the Kendall's tau is defined based on the normalized Kendall's distance. Note that the greater the Kendall's tau is, the greater the similarity degree between the compared rankings is. In this paper, we use the Kendall's distance rather than the Kendall's tau."
My goal is to improve the following set of rankings using Kendall's distance:
x1 x2 x3 x4
A1 4 1 3 2
A2 4 1 3 2
A3 4 3 2 1
A4 1 4 3 2
A5 1 2 4 3
In this table, the ith row is the ranking produced by Ai, and each column gives the position assigned to the corresponding item in that ranking (i.e. the xn are the items to be ranked, and the Ai are the rankers).
Despite the document's explanation, I don't understand the difference between the two measures. Also, what does the "(j,s), j != s" beneath the sigma symbol stand for? And finally, how do I apply Kendall's distance to the rankings provided above?
Distance and similarity are related concepts. For a distance, exact identity means distance 0, and the distance grows as things get more different, with no obvious fixed upper limit; a well-behaved distance obeys the rules for a metric - see https://en.wikipedia.org/wiki/Metric_(mathematics). For a similarity, exact identity means similarity 1, and the similarity decreases as things get more different, but usually never drops below 0. Kendall's tau seems to be a way of turning Kendall's distance into a similarity.
"(j,s), j != s" means consider all possibilities for j and s except those for which j = s.
You can compute Kendall's distance by simply summing over all possibilities for j not equal to s - but the time taken for this goes up with the square of the number of items. There are ways for which the time taken only goes up as n * log(n) where n is the number of items - for this and much other stuff on Kendall see https://en.wikipedia.org/wiki/Kendall_rank_correlation_coefficient
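To answer the implementation part, here is a minimal Python sketch of the quadratic approach applied to the rankings in your question (the function name kendall_distance and the dictionary layout are my own choices, not from the document; normalization divides by C(n,2) as the quoted text says):

```python
from itertools import combinations

def kendall_distance(r1, r2, normalize=True):
    """Count pairwise disagreements between two rankings.

    r1 and r2 are position vectors: r[i] is the ranking position
    assigned to item x_{i+1}.  Each pair of items contributes 1 if the
    two rankings order them differently, 0 otherwise.
    """
    n = len(r1)
    disagreements = 0
    for j, s in combinations(range(n), 2):
        # Opposite signs => the pair is ordered differently in r1 and r2.
        if (r1[j] - r1[s]) * (r2[j] - r2[s]) < 0:
            disagreements += 1
    if normalize:
        return disagreements / (n * (n - 1) / 2)  # divide by C(n, 2)
    return disagreements

# Position vectors copied from the table in the question (rows A1..A5).
rankings = {
    "A1": [4, 1, 3, 2],
    "A2": [4, 1, 3, 2],
    "A3": [4, 3, 2, 1],
    "A4": [1, 4, 3, 2],
    "A5": [1, 2, 4, 3],
}

# Normalized Kendall's distance between every pair of rankers.
for (a, ra), (b, rb) in combinations(rankings.items(), 2):
    print(f"K({a}, {b}) = {kendall_distance(ra, rb):.3f}")  # e.g. K(A1, A2) = 0.000
```

If you also want the tau from the quoted definition, it is just 1 - 2 * (normalized distance), assuming the usual convention relating the two.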