Tags: python, pandas, python-2.7, ranking

Ranking Pandas DataFrame Based on Closeness to a Specified Value


Apologies if this is a trivial task. I am very new to coding and Python and am learning as part of my dissertation.

I have a data frame and want to rank each column within it based on closeness to a specified value, rather than in ascending or descending order.

I am working on a way to compare running/cycling routes. As part of this process I am trying to find how a query route compares to a target route on a few attributes: Distance, Elevation Gain, Elevation Loss and Gradient. My resulting data frame shows the relative error between the two routes for each attribute, i.e. (target route value - query route value) / target route value. The problem I am currently facing is ranking these results. Since a perfect match has a value of 0, I want to rank the values by how close they are to 0.
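For reference, the relative error described above can be sketched as follows. This is a minimal illustration, not my actual pipeline: the names `targets` and `query` and all the attribute values are made up, and only two of the four attributes are shown.

```python
import pandas as pd

# Hypothetical target routes (one row per route); values are illustrative only.
targets = pd.DataFrame({
    'distance': [12.0, 8.0, 10.0],
    'elevation_gain': [250.0, 80.0, 100.0],
})

# Hypothetical query route, identical to the third target route.
query = pd.Series({'distance': 10.0, 'elevation_gain': 100.0})

# (target - query) / target; the Series is aligned with the DataFrame's columns.
errors = (targets - query) / targets
print(errors)
```

A row where the target equals the query comes out as 0 in every column, which is why 0 is the value to rank against.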

The data frame to be ranked:

import pandas as pd
scores = pd.DataFrame({'distance': [0.15, 0.07, -0.09, 0],
                       'elevation_gain': [-0.19, -8.39, -0.86, 0],
                       'elevation_loss': [-3.73, -2.51, -0.16, 0],
                       'gradient': [0.12, 0.39, 2.77, 0]})

In this case, the 4th route is the query route itself, so it is a perfect match and should be ranked 1st.
Because there are negative values, I don't think a plain descending ranking would be suitable.

What I am aiming for is:

ranks = pd.DataFrame({'distance': [4, 2, 3, 1],
                      'elevation_gain': [2, 4, 3, 1],
                      'elevation_loss': [4, 3, 2, 1],
                      'gradient': [2, 3, 4, 1]})

(Apologies, I don't know how to display these data frames in a way that is easier to digest.)

I could then create a new column summing the ranks; the lowest total would indicate the best match.

Thanks for any help in advance!


Solution

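  • Take the absolute value of each score, so that the distance from 0 is what gets ranked, then rank each column in ascending order: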

    ranks = scores.abs().apply(pd.Series.rank).astype(int)
    ranks 
    

    Output:

       distance  elevation_gain  elevation_loss  gradient
    0         4               2               4         2
    1         2               4               3         3
    2         3               3               2         4
    3         1               1               1         1
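The follow-up step mentioned in the question (summing the ranks and taking the lowest total) could look roughly like the sketch below. It uses `DataFrame.rank` directly with `method='min'`, which is a swapped-in variation on the line above: the default ranking averages tied values (e.g. 1.5), and `astype(int)` would silently truncate those. The names `total` and `best` are just illustrative.

```python
import pandas as pd

scores = pd.DataFrame({'distance': [0.15, 0.07, -0.09, 0],
                       'elevation_gain': [-0.19, -8.39, -0.86, 0],
                       'elevation_loss': [-3.73, -2.51, -0.16, 0],
                       'gradient': [0.12, 0.39, 2.77, 0]})

# Rank by distance from 0; method='min' gives tied values the same whole-number rank.
ranks = scores.abs().rank(method='min').astype(int)

# Sum the ranks across the attribute columns for each route.
total = ranks.sum(axis=1)

# Index label of the route with the lowest total, i.e. the closest overall match.
best = total.idxmin()

print(ranks.assign(total=total))
print(best)
```

With this sample data the three non-matching routes all end up with the same total, so a plain sum may not separate them; whether to weight the attributes differently is a separate modelling choice.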