I'm writing a master program for machine learning from scratch in Python, and as a first step I want to check whether the data set is normal or not. P.S.: the data set can have many features or just a single feature.
It has to be implemented in Python 3.
Also, normalizing the data can be done with the functions below, right?
    # Find the min and max values for each column
    def dataset_minmax(dataset):
        minmax = list()
        for i in range(len(dataset[0])):
            col_values = [row[i] for row in dataset]
            value_min = min(col_values)
            value_max = max(col_values)
            minmax.append([value_min, value_max])
        return minmax

    # Rescale dataset columns to the range 0-1 (modifies dataset in place)
    def normalize_dataset(dataset, minmax):
        for row in dataset:
            for i in range(len(row)):
                col_range = minmax[i][1] - minmax[i][0]
                # A constant column has zero range; map it to 0.0 to avoid dividing by zero
                row[i] = (row[i] - minmax[i][0]) / col_range if col_range else 0.0
Thanks in advance!
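Your question mixes two different things. If your features do not come from a normal distribution, you cannot "normalize" them in the sense of changing their distribution: min-max rescaling only changes the range of each column, not the shape of its distribution. If you instead mean checking whether the features have a mean of 0 and a standard deviation of 1, that is a different ball game.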
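To make the two readings concrete, here is a minimal from-scratch sketch in Python 3. It is only an illustration, not a definitive implementation: the function names (`column_stats`, `is_standardized`, `jarque_bera`) and the tolerance `tol` are my own assumptions, and I have picked the Jarque-Bera statistic as one possible normality check because it needs only skewness and kurtosis. Other tests (for example Shapiro-Wilk) may suit your data better.

```python
import math


def column_stats(dataset):
    """Return (mean, std, skewness, excess kurtosis) for each column.

    Uses population moments (dividing by n). `dataset` is a list of rows,
    the same layout as in the question.
    """
    stats = []
    for i in range(len(dataset[0])):
        col = [row[i] for row in dataset]
        n = len(col)
        mean = sum(col) / n
        # Second, third and fourth central moments
        m2 = sum((x - mean) ** 2 for x in col) / n
        m3 = sum((x - mean) ** 3 for x in col) / n
        m4 = sum((x - mean) ** 4 for x in col) / n
        if m2 == 0:
            # Constant column: skewness and kurtosis are undefined
            skew, kurt = float("nan"), float("nan")
        else:
            skew = m3 / m2 ** 1.5
            kurt = m4 / m2 ** 2 - 3.0
        stats.append((mean, math.sqrt(m2), skew, kurt))
    return stats


def is_standardized(dataset, tol=1e-6):
    """Reading 1: does each column have mean 0 and standard deviation 1?"""
    return [abs(mean) <= tol and abs(std - 1.0) <= tol
            for mean, std, _, _ in column_stats(dataset)]


def jarque_bera(dataset):
    """Reading 2: Jarque-Bera statistic per column.

    JB = n/6 * (S^2 + K^2/4), with S the skewness and K the excess kurtosis.
    Large values are evidence against normality.
    """
    n = len(dataset)
    return [n / 6.0 * (skew ** 2 + kurt ** 2 / 4.0)
            for _, _, skew, kurt in column_stats(dataset)]


if __name__ == "__main__":
    data = [[-1.0, 10.0], [1.0, 20.0], [0.0, 90.0]]
    print(is_standardized(data))
    print(jarque_bera(data))
```

Under the null hypothesis of normality the statistic is approximately chi-squared with 2 degrees of freedom, so a value above roughly 5.99 rejects normality at the 5% level. That approximation is asymptotic and can be poor for small samples, so treat the threshold as a rough guide.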