I'm new to Python, and I ran this code:
import numpy as np
import matplotlib.pyplot as plt
from sklearn import preprocessing
test1 = np.array([95, 91, 104, 93, 85, 107, 97, 90, 86, 93, 86, 90, 88, 89, 94, 96, 89, 99, 104, 101, 84, 84, 94, 87, 99, 85, 83, 107, 102, 80, 89, 88, 93, 101, 87, 100, 82, 90, 106, 81, 95])
plt.hist(test1)
plt.show()
Then I normalize the data and check the plot again:
plt.gcf().clear()
test2 = preprocessing.normalize([test1])
plt.hist(test2)
plt.show()
The new plot has a different shape, and on the histogram every number appears only once, which looks strange compared to the first plot. I expected something similar to the first plot, but with a range from 0 to 1. Where am I going wrong?
Here is one solution. You need MinMaxScaler, whose default range for normalizing is (0, 1). For more info, refer to the official sklearn documentation page.
import numpy as np
import matplotlib.pyplot as plt
from sklearn import preprocessing
test1 = np.array([95, 91, 104, 93, 85, 107, 97, 90, 86, 93, 86, 90, 88, 89, 94, 96, 89, 99, 104, 101, 84, 84, 94, 87, 99, 85, 83, 107, 102, 80, 89, 88, 93, 101, 87, 100, 82, 90, 106, 81, 95])
min_max_scaler = preprocessing.MinMaxScaler()
# MinMaxScaler expects a 2D array with one column per feature
test2 = min_max_scaler.fit_transform(test1.reshape(-1, 1))
plt.hist(test2)
plt.show()
Output: a histogram with the same shape as the first plot, with values ranging from 0 to 1.
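As for why the original attempt looked odd, here is my understanding. preprocessing.normalize([test1]) with its default L2 norm divides the single row by its Euclidean length, so it does not map the data onto 0 to 1. The result is also a 2D array of shape (1, 41), and plt.hist treats each column of a 2D input as a separate dataset, which would explain why every value shows up once. Below is a minimal NumPy-only sketch of both effects. The names row_normalized and col_scaled are mine, and the arithmetic mirrors what I believe normalize and MinMaxScaler do rather than calling them directly.

```python
import numpy as np

test1 = np.array([95, 91, 104, 93, 85, 107, 97, 90, 86, 93, 86, 90, 88, 89,
                  94, 96, 89, 99, 104, 101, 84, 84, 94, 87, 99, 85, 83, 107,
                  102, 80, 89, 88, 93, 101, 87, 100, 82, 90, 106, 81, 95])

# Assumed equivalent of preprocessing.normalize([test1]) with the default
# L2 norm: the single row is divided by its Euclidean length.
row_normalized = (test1 / np.linalg.norm(test1)).reshape(1, -1)
print(row_normalized.shape)  # (1, 41): one row, 41 columns
# Passing this to plt.hist gives 41 datasets with one value each.

# Min-max scaling by hand: (x - min) / (max - min), kept as one column.
col_scaled = ((test1 - test1.min()) / (test1.max() - test1.min())).reshape(-1, 1)
print(col_scaled.shape)  # (41, 1): 41 samples of a single feature
print(col_scaled.min(), col_scaled.max())  # 0.0 1.0

# Min-max scaling is linear, so the bin counts should be unchanged.
counts_before, _ = np.histogram(test1, bins=10)
counts_after, _ = np.histogram(col_scaled.ravel(), bins=10)
print(np.array_equal(counts_before, counts_after))
```

If you want to stay with normalize for some reason, flattening its result first (for example with .ravel()) should at least give a single histogram, but the values would still not span 0 to 1.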