r range rescale

Rescale a variable with median 1, minimum value 0 and no limit on the maximum value


I am new to statistics, so I apologize if this question is trivial.

I have a variable that is normally distributed, with a range between -15 and +15. The following example approximates it (note that runif() draws from a uniform, not a normal, distribution):

 df <- data.frame("weight" = runif(1000, min=-15, max=15), stringsAsFactors = FALSE)

The median and mean of this variable are approximately 0.

I need to transform this variable to use it as a weight in my regression. For substantive reasons, it does not make sense to have negative values in my variable (it is itself the result of previous transformations). Negative values of my variable should simply reduce the effect of my main explanatory variable (hence should map to values between 0 and 1), positive values should have a multiplicative effect on my explanatory variable (map to values greater than 1), and values of my weight close to 0 should have almost no effect on my explanatory variable (map to values close to 1).

Hence I would like to rescale my variable so that the minimum value of my weight is 0 and the median value becomes 1. I do not want to put constraints on the maximum value, though this will necessarily change the mean (it will become greater than 1). I am not concerned about this, provided that the median remains 1.

So far I have considered rescaling the variable to the range between 0 and 2:

 library(BBmisc)
 df$normalizedweight <- normalize(df$weight, method = "range",
        range = c(0, 2)) 

However, this operation puts an unnecessary constraint on my normalized variable, as the effect of my weight can be greater than a factor of two.

To clarify, in the real data, negative values of the weight perfectly mirror positive values. Ideally, once I have rescaled the data, multiplying the same number by the maximum and by the minimum value of the weight should increase or decrease it by the same proportion. For example, taking a response value of 5 and a maximum weight of 10, the minimum weight should be 0.1, so that 5*10 and 5*0.1 represent a proportional increase and decrease by a factor of 10 relative to the original value.

I thank you in advance for all the help you are able to provide

Best


Solution

  • One option is to use an exponential transformation. All your negative values will map to between 0 and 1, and all your positive values will map to above 1, so your median will be close to 1. Moreover, because exp() produces very large values (exp(15) ≈ 3 269 017), you can first divide your values by their maximum.

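    # Simulate weights that are symmetric around 0
    sample <- runif(10000, min = -15, max = 15)

    # Divide by the maximum before exponentiating, so the result
    # stays within roughly exp(-1) to exp(1)
    sample_transform <- exp(sample / max(sample))
    median(sample_transform)
    # [1] 0.9930663
    hist(sample_transform)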
    

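The question also asks that the maximum and minimum weights act as reciprocal factors (for example 10 and 0.1). A minimal sketch of one way to get that, assuming the weights are mirrored around 0: replace the base e with a chosen factor k, so that the largest weight maps to k and the smallest to 1/k. The helper name `rescale_weight` and the default `k = 10` are illustrative assumptions, not part of the original answer.

```r
# Hypothetical helper: map weights that are symmetric around 0 onto a
# multiplicative scale. `k` is an assumed maximum factor (10 is taken
# from the example in the question).
rescale_weight <- function(w, k = 10) {
  k^(w / max(abs(w)))
}

set.seed(1)                            # arbitrary seed, for reproducibility only
w <- runif(1000, min = -15, max = 15)
w <- c(w, -w)                          # enforce exact mirror symmetry, as described for the real data

mw <- rescale_weight(w)

range(mw)     # smallest and largest factor: 1/k and k
median(mw)    # close to 1
5 * max(mw)   # a value of 5 scaled up by the factor k
5 * min(mw)   # the same value scaled down by the factor k
```

With this form the transformed weight is always strictly positive, so it approaches 0 only as k grows and never reaches it exactly. If the data are not exactly symmetric, the smallest factor will not be exactly 1/k.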