How to calculate a distance matrix from a diagonal line?


Suppose we have a data frame called Pred.

It contains one user per row.

The users are specified by their unique userID.

Users can be grouped by the cluster they belong to.

Users reported their Confidence and Challenge for a task; these values are stored as Conf and Chall, respectively.

Note that Conf and Chall share the same range, i.e., 1-6.

cluster   userID      Conf        Chall
1           A           5           3
2           B           3           2
1           C           6           1
1           D           3           4
2           E           2           4
2           F           3           5
1           G           6           2
1           H           5           5
2           I           6           2
2           J           5           4
2           K           1           1
1           L           3           5
1           M           4           4
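
For anyone who wants to reproduce the problem, the table above can be rebuilt roughly as follows. This is only a sketch: the column types (numeric cluster, character userID) are an assumption, since the original post shows just the printed values.

```r
# Rebuild the example data frame from the table above.
# Column types are assumed; the post only shows the printed values.
Pred <- data.frame(
  cluster = c(1, 2, 1, 1, 2, 2, 1, 1, 2, 2, 2, 1, 1),
  userID  = LETTERS[1:13],  # "A" through "M"
  Conf    = c(5, 3, 6, 3, 2, 3, 6, 5, 6, 5, 1, 3, 4),
  Chall   = c(3, 2, 1, 4, 4, 5, 2, 5, 2, 4, 1, 5, 4),
  stringsAsFactors = FALSE
)

# Users whose point lies exactly on the diagonal (Conf == Chall)
Pred$userID[Pred$Conf == Pred$Chall]
```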

Suppose we make a scatter plot with Conf on the x-axis and Chall on the y-axis.

The points where:

Conf == Chall

would be on the diagonal line which passes through the origin.

Now I am interested in finding the distance of each user from the diagonal line based on their coordinates:

(Conf, Chall) 

Overall, the question deals with finding the distance of points (Conf, Chall) from the line at the diagonal.

Note: I am not interested in plotting the graph. I am interested in calculating a distance vector.

I understand this is perhaps a very basic question, but I have been struggling with it for the past few days. A simple demo example would help me understand the problem.

I would appreciate any guidance on this!


Solution

  • The Euclidean distance from a point (x,y) to the diagonal line is given by

    abs(x - y) / sqrt(2)
    

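    This follows from the general point-to-line distance |a*x + b*y + c| / sqrt(a^2 + b^2) applied to the line x - y = 0, i.e., a = 1, b = -1, c = 0. Thus, you may use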

    (Pred$distance <- abs(Pred$Conf - Pred$Chall) / sqrt(2))
    # [1] 1.4142136 0.7071068 3.5355339 0.7071068 1.4142136 1.4142136 2.8284271
    # [8] 0.0000000 2.8284271 0.7071068 0.0000000 1.4142136 0.0000000
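
If it helps to see where the shortcut comes from, here is a minimal sketch that checks it against the general point-to-line formula. The helper name dist_to_line and the three sample points (users A, B and C from the question) are only illustrative; they are not part of the original answer.

```r
# Hypothetical helper: distance from points (x, y) to the line a*x + b*y + c = 0.
dist_to_line <- function(x, y, a, b, c) {
  abs(a * x + b * y + c) / sqrt(a^2 + b^2)
}

# First three users from the question (A, B, C)
Conf  <- c(5, 3, 6)
Chall <- c(3, 2, 1)

# The diagonal Conf == Chall is the line x - y = 0, i.e. a = 1, b = -1, c = 0
d_general  <- dist_to_line(Conf, Chall, a = 1, b = -1, c = 0)
d_shortcut <- abs(Conf - Chall) / sqrt(2)
all.equal(d_general, d_shortcut)

# Optional: a signed version keeps track of which side of the diagonal a
# point is on (positive when Conf > Chall, i.e. below the diagonal).
d_signed <- (Conf - Chall) / sqrt(2)
```

The signed variant is a design choice rather than something the question asked for; dropping abs() may be useful if you later want to separate users whose confidence exceeds the challenge from those for whom it falls short.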