I am trying to scale two different sets of data to be visually equivalent.
The green data set has extreme Y values and significantly more data points, so the orange data set looks flat and short next to it.
What functions exist that would let me scale the two sets so they are comparable to one another?
*Future viewers: min-max normalization is one method, as mentioned in the answers.
You can do this with min-max normalization, which rescales each data set so that its smallest value becomes 0 and its largest becomes 1.
import numpy as np
import matplotlib.pyplot as plt
# Define the green and orange data sets
green_data = np.random.normal(50, 10, 100)
orange_data = np.random.normal(25, 5, 10)
# Normalize the data sets using min-max scaling
green_data_normalized = (green_data - np.min(green_data)) / (np.max(green_data) - np.min(green_data))
orange_data_normalized = (orange_data - np.min(orange_data)) / (np.max(orange_data) - np.min(orange_data))
# Plot the normalized data sets
plt.plot(green_data_normalized, label='Green Data')
plt.plot(orange_data_normalized, label='Orange Data')
plt.legend()
plt.show()
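If you end up normalizing more than two series, it may be easier to wrap the formula in a small helper. This is only a sketch: the name min_max_scale is made up for illustration, and returning zeros for a constant series (where max equals min and the formula would divide by zero) is my own assumption about what you would want there.

```python
import numpy as np

def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    values = np.asarray(values, dtype=float)
    lo, hi = values.min(), values.max()
    if hi == lo:
        # Assumption: a constant series has zero range, so return zeros
        # rather than dividing by zero.
        return np.zeros_like(values)
    return (values - lo) / (hi - lo)

# Small worked example: 10 -> 0, 20 -> 1/3, 40 -> 1
scaled = min_max_scale([10.0, 20.0, 40.0])
print(scaled)
```

Keep in mind that each series is scaled against its own minimum and maximum, so the plot shows the shapes side by side, not their original magnitudes.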
Edit: If you want the orange values to span the same x-width as the green values, you can draw a straight line between each pair of neighboring points and read new values off that line at evenly spaced positions in between. This widens the line by creating more data points. NumPy has this built in as np.interp (short for interpolate).
import numpy as np
import matplotlib.pyplot as plt
# Define the green and orange data sets
green_data = np.random.normal(50, 10, 100)
orange_data = np.random.normal(25, 5, 10)
# Define the x-values for the original and extended orange data
x_orange_original = np.linspace(0, 1, len(orange_data))
x_orange_extended = np.linspace(0, 1, len(green_data))
# Interpolate the orange data to extend it
orange_data_extended = np.interp(x_orange_extended, x_orange_original, orange_data)
# Normalize the data sets using min-max scaling
green_data_normalized = (green_data - np.min(green_data)) / (np.max(green_data) - np.min(green_data))
orange_data_normalized = (orange_data_extended - np.min(orange_data_extended)) / (np.max(orange_data_extended) - np.min(orange_data_extended))
# Plot the normalized data sets
plt.plot(green_data_normalized, label='Green Data')
plt.plot(orange_data_normalized, label='Orange Data')
plt.legend()
plt.show()
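To see what np.interp is doing in the code above, here is a tiny example with numbers you can check by hand. The three-point array is made up for illustration and stands in for the orange data.

```python
import numpy as np

# Hypothetical short series standing in for the orange data
orange = np.array([0.0, 10.0, 20.0])

# Original points sit at x = 0, 0.5, 1
x_original = np.linspace(0, 1, len(orange))
# Stretch to 5 points at x = 0, 0.25, 0.5, 0.75, 1
x_extended = np.linspace(0, 1, 5)

# Each new value is read off the straight line between its two neighbors
extended = np.interp(x_extended, x_original, orange)
print(extended)  # values: 0, 5, 10, 15, 20
```

The new points are only straight-line estimates between the measured ones, so the stretched line has the same shape as the original but carries no extra information.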