I am trying to map reinforcement learning continuous action values, which lie in the range (-1.0, 1.0), to real outputs.
Suppose I have a numpy action array actions = np.array([-1., 0.2, -0.3, 0.5]). Each value in the array can be as low as -1. and as high as 1.
Now, I need to map each entry to its own separate range, for example the four [low, high] pairs in mapped_low_high below.
The mapped actions should then be mapped_actions = np.array([-0.4, 1.484, -0.86, -0.085]).
My current solution is as follows:
import numpy as np
actions = np.array([-1., 0.2, -0.3, 0.5])
mapped_low_high = np.array([[-0.4, 1.54], [1.4, 1.54],[-2.4, 2.], [-1.54, 0.4]])
mapped_actions = np.zeros_like(actions)
for i in range(actions.shape[0]):
    mapped_actions[i] = np.interp(actions[i], (-1., 1.), mapped_low_high[i])
Looping over numpy arrays in Python is slow. It is faster to write a vectorized function that operates on the entire array at once. Since the input array always lies between -1 and 1 and the target ranges are given as an array, the mapping reduces to a simple linear function. The general technique is to map the original data to [0, 1] and then map that to [a, b]. For an input x in [-1, 1] this is a + (x + 1) / 2 * (b - a).
import numpy as np
actions = np.array([-1., 0.2, -0.3, 0.5])
mapped_low_high = np.array([[-0.4, 1.54],
                            [1.4, 1.54],
                            [-2.4, 2.],
                            [-1.54, 0.4]])
def remap(arr, mappings):
    out = arr.copy()
    a, b = mappings.T
    # remap out to [0,1]
    # assumes arr is [-1,1]
    out += 1.
    out /= 2.
    # remap out to [a,b]
    out *= b - a
    out += a
    return out
remapped = remap(actions, mapped_low_high) # array([-0.4 , 1.484, -0.86 , -0.085])
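As a possible variation, here is a minimal sketch that takes the source range as a parameter instead of hard-coding [-1, 1]. The name remap_range and the src_low, src_high and clip parameters are my own and not part of the question. One difference worth noting: np.interp clamps inputs that fall outside the source range to the end points, while the plain linear formula extrapolates, so the sketch clips by default to match the behaviour of the original loop.

```python
import numpy as np

def remap_range(arr, mappings, src_low=-1.0, src_high=1.0, clip=True):
    """Linearly map arr[i] from [src_low, src_high] to [mappings[i, 0], mappings[i, 1]].

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    arr = np.asarray(arr, dtype=float)
    low, high = np.asarray(mappings, dtype=float).T
    if clip:
        # np.interp clamps out-of-range inputs to the end points; mimic that here
        arr = np.clip(arr, src_low, src_high)
    # fraction of the way through the source range (in [0, 1] when clipped)
    frac = (arr - src_low) / (src_high - src_low)
    return low + frac * (high - low)

actions = np.array([-1., 0.2, -0.3, 0.5])
mapped_low_high = np.array([[-0.4, 1.54], [1.4, 1.54], [-2.4, 2.], [-1.54, 0.4]])

vectorized = remap_range(actions, mapped_low_high)

# Cross-check against the per-element np.interp loop from the question
looped = np.array([np.interp(x, (-1., 1.), lh)
                   for x, lh in zip(actions, mapped_low_high)])
print(np.allclose(vectorized, looped))
```

If the policy output is guaranteed to stay inside [-1, 1], passing clip=False skips the extra np.clip call and gives the same result.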