algorithm sorting scoring

Non-linear comparison sorting / scoring


I have an array I want to sort based on assigning a score to each element in the array.

Let's say the possible score range is 0-100. To get that score we are going to use 2 data points, one with a weighting of 75 and one with a weighting of 25. Let's call them valueA and valueB. Each value gets mapped to a score. So:

valueA (range = 0-10,000)
valueB (range = 0-70)
scoreA (range = 0 - 75)
scoreB (range = 0 - 25)
scoreTotal = scoreA + scoreB (0 - 100)

Now the question is how to map valueA to scoreA in a non-linear way, with heavier weighting for being close to the min value. What I mean is that for valueA, 0 would be a perfect score (75), a value of say 20 would give a mid-point score of 37.5, a value of say 100 would give a very low score of say 5, and everything greater would trend towards 0 (e.g. a value of 5,000 would be essentially 0).

Ideally I could set up a curve with a few data points (say 4 quartile points) and the algorithm would fit to that curve. Or maybe the simplest solution is to create a bunch of points on the curve (say 10) and interpolate linearly between each pair of neighboring points? But I'm hoping there is a much simpler algorithm that doesn't require figuring out all the points on the curve myself and then tweaking 10+ variables. I'd rather have 1 or 2 inputs that define how steep the curve is. Possible?

I don't need something super complex or accurate, just a simple algorithm so there is greater weighting for being close to the min of the range, and way less weighting for being close to the max of the range. Hopefully this makes sense.

My stats math is so rusty I'm not even sure what this is called for searching for a solution. All those years of calculus and statistics for naught.

I'm implementing this in Objective-C, but any C-ish/Java-ish pseudocode would be fine.


Solution

  • A function you may want to try is

    max / [(log(x+2)/log(2))^N]
    

    where max is either 75 or 25 in your case. The log(x+2)/log(2) part ensures that f(0) == max. You can substitute log(x+C)/log(C) for any C > 1 (C must be greater than 1 so that log(C) is positive); a higher C will slow the curve's descent. The ^N determines how quickly the function drops toward 0; plotting it for a few values of N gives a picture of what's going on.