python pandas numpy dataframe normalize

How to scale a numpy array from 0 to 1 with overshoot?


I am trying to scale a pandas Series or NumPy array that runs from 0 to an unknown maximum, so that a defined number maps to 1 and larger values overshoot 1.

One solution I tried is simply dividing the array by the defined number.

test = df['Temp'] / 33

[Plot of the result]

This method does not scale all the way from 0, and I'm stuck trying to find a better mathematical way of solving this.


Solution

  • First, convert the DataFrame column to a NumPy array:

    import numpy as np
    T = np.array(df['Temp'])
    

    Then scale it to a [0, 1] interval:

    def scale(A):
        # Min-max scaling: the minimum maps to 0 and the maximum to 1
        return (A - np.min(A)) / (np.max(A) - np.min(A))
    
    T_scaled = scale(T)
    

    Then map it to any interval you want, e.g. [55, 100]:

    T2 = 55 + 45*T_scaled
    

    This can be done within pandas too (though I'm not very familiar with it): a Series supports the same element-wise arithmetic and has its own min() and max() methods, so df.apply() should not be needed.
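
The min-max scaling above always maps the largest value to 1, so nothing can overshoot. For the overshoot the question asks about, one possible variation is to divide by the distance to a chosen reference value instead of the maximum. This is a minimal sketch and not part of the original answer: the names `scale_to_reference`, `ref` and `low` are hypothetical, the sample temperatures are made up, and it assumes the array minimum should map to 0 unless `low` is given.

```python
import pandas as pd

def scale_to_reference(values, ref, low=None):
    """Map `low` to 0 and `ref` to 1; values above `ref` overshoot 1."""
    values = pd.Series(values, dtype=float)
    if low is None:
        low = values.min()  # assumption: the smallest value should map to 0
    if ref == low:
        raise ValueError("ref must differ from low")
    return (values - low) / (ref - low)

# Made-up sample data, for illustration only
df = pd.DataFrame({"Temp": [20.0, 26.5, 33.0, 39.5]})
df["Temp_scaled"] = scale_to_reference(df["Temp"], ref=33)
print(df)
```

Here 20 maps to 0, 33 maps to 1, and 39.5 overshoots to 1.5. Passing `low=0` reduces this to the plain division from the question, `df['Temp'] / 33`.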