
How do I write a mathematical formula that non-linearly scales a continuous variable to a 0-100 span, where f(x)→100 as x→Inf?


I am building an index of user qualities defined as a sum of (often) correlated continuous variables representing user activity. The index is well calibrated and serves the purpose of my analysis, but it is tricky to communicate to my co-workers, particularly since outlier activities cause extremely tenacious users to score very highly on the activity index.

For 97% of users, the index is distributed near-normally between 0 and 100, with a right tail of 3% of hyper-active users with an index > 100. Index-values beyond 200 should be extremely rare but are theoretically possible.

I'm looking to scale the tail back into a 0-100 span, but not linearly, since I would like the 3% tail to be represented as small variances within the top range of the 0-100 index. What I'm looking for is a non-linear formula to scale my index, like this:

[image: sketch of the desired scaling curve]

The lower tier of the unscaled index should remain close to the scaled one, while high index values diverge and the scaled values approach but never reach 100 as my index goes towards infinity. So f(0) = 0, but when x = 140, f(x) ≈ 99 or something similar.

I'll implement the scaling in R, Python and BigQuery.


Solution

  • There are lots of ways to do this: take any function with the right shape and tweak it to your needs.

    One family of functions with the right shape is

    f(x) = x/pow(1 + pow(x/100, n), 1/n)
    

    You can vary the parameter n to adjust the shape: increasing n pushes f(100) closer to 100, since f(100) = 100/pow(2, 1/n). With n=5 you get something that looks pretty close to your drawing:

    f(x) = x/pow(1 + pow(x/100, 5), 0.2)
    

    [image: plot of f(x) with n=5]

    Another option is the hyperbolic tangent function tanh, which you can of course tweak in similar ways:

    f(x) = 100*pow(tanh(pow(x/100, n)), 1/n)
    

    Here's the curve with n=2:

    [image: plot of the tanh-based f(x) with n=2]
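
Since the question mentions a Python implementation, the two formulas in this answer could be sketched roughly as follows. This is only an illustration, not the answerer's code: the function names `scale_power` and `scale_tanh`, the `cap` parameter (the 100 in the formulas), and the handling of negative and very large inputs are all assumptions added here.

```python
import math


def scale_power(x, n=5.0, cap=100.0):
    """f(x) = x / pow(1 + pow(x/cap, n), 1/n) for x >= 0.

    Names and defaults are illustrative assumptions, not from the answer.
    """
    if x < 0:
        # Assumption: the index is non-negative, as described in the question.
        raise ValueError("x must be non-negative")
    if x <= cap:
        return x / (1.0 + (x / cap) ** n) ** (1.0 / n)
    # Algebraically the same formula, rewritten as
    # cap / pow(1 + pow(cap/x, n), 1/n) so that huge x cannot overflow.
    return cap / (1.0 + (cap / x) ** n) ** (1.0 / n)


def scale_tanh(x, n=2.0, cap=100.0):
    """f(x) = cap * pow(tanh(pow(x/cap, n)), 1/n) for x >= 0."""
    if x < 0:
        raise ValueError("x must be non-negative")
    try:
        u = (x / cap) ** n
    except OverflowError:
        # pow overflowed for a huge x; tanh(inf) is exactly 1.0.
        u = math.inf
    return cap * math.tanh(u) ** (1.0 / n)


if __name__ == "__main__":
    # Print both scalings side by side for a few index values.
    for x in (0, 50, 100, 140, 200, 1000):
        print(x, round(scale_power(x), 2), round(scale_tanh(x), 2))
```

With the default n=5, `scale_power(140)` comes out at roughly 96.6 rather than the 99 mentioned in the question; a larger n moves it up (n=8 gives about 99.2). Note that in floating point both functions eventually round to exactly 100 for large enough x, even though the mathematical curves never reach it.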