I have 15000 uint images that I have vectorized to use as input to my convolutional neural network (the resulting array is 15000x8192). My question is about scaling: if I scale as below, I get good results
from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler()
x_train = scaler.fit_transform(x_train)  # fit on the data before transforming
but if I do the following, I don't
x_train = x_train / 65535
The maximum and minimum pixel values for my images are 31238 and 16841. Is the first approach correct when dealing with images?
I found a third approach, shown below, which looks more reasonable
X_set_uint8 = cv2.normalize(X_set_16, None, 0, 255, cv2.NORM_MINMAX, dtype=cv2.CV_8U)
# Normalize pixel values to be between 0 and 1
X_set_scaled = X_set_uint8 / 255
All three approaches should give the same performance, but they do not. That is what confuses me.
Okay, so you've revealed you are using data from a spectrograph! Remember how I said the most important thing is to think about your data?
We know you need to normalize your data, since the network will converge faster. Ideally we want the inputs to be normally distributed.
One big problem with a spectrogram is that standard normalization techniques are of little use, since the data is very heavy-tailed.
You'll probably want to take an adjusted logarithm of your values: take log(x + c) where you adjust c until you see something Gaussian. A more advanced technique would be to use the Box-Cox transformation.
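As a rough sketch of the log(x + c) idea, the snippet below tries a few offsets and keeps the one whose transformed values have the smallest absolute skewness. The data is a synthetic heavy-tailed stand-in, and the helper names (`skewness`, `best_log_offset`) and the candidate list are assumptions made for illustration, not part of any library. For Box-Cox, SciPy provides `scipy.stats.boxcox`, which estimates the exponent for you.

```python
import numpy as np

def skewness(values):
    """Sample skewness (third standardized moment); 0 for symmetric data."""
    centered = values - values.mean()
    return float((centered ** 3).mean() / (centered ** 2).mean() ** 1.5)

def best_log_offset(values, candidates):
    """Return the offset c for which log(values + c) has the smallest |skewness|."""
    return min(candidates, key=lambda c: abs(skewness(np.log(values + c))))

# Synthetic heavy-tailed values standing in for spectrogram magnitudes
# (an assumption; replace with your own positive-valued data).
rng = np.random.default_rng(0)
x = rng.lognormal(mean=0.0, sigma=1.5, size=10_000)

candidates = [1e-3, 1e-2, 1e-1, 1.0, 10.0]  # hypothetical search grid
c = best_log_offset(x, candidates)
x_log = np.log(x + c)

print("chosen c:", c)
print("skewness before:", skewness(x))
print("skewness after: ", skewness(x_log))
```

Skewness is only one crude check for "looks Gaussian"; a histogram of `x_log` is usually worth a look as well.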
Now for normalizing by the min and max, you'll probably want to use the minimum and maximum values for a spectrogram instead of what your data shows.
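To see why the scalings from the question are not interchangeable, here is a small sketch on synthetic data that uses the pixel range quoted in the question. It compares per-feature min-max (which is what `MinMaxScaler` computes by default), division by the fixed 16-bit maximum, and a single global min-max (roughly what `cv2.normalize` with `NORM_MINMAX` does, before the extra 8-bit quantization). The array shape and random values are assumptions for illustration only.

```python
import numpy as np

# Synthetic stand-in for the vectorized images, using the quoted pixel range.
rng = np.random.default_rng(0)
x = rng.integers(16841, 31239, size=(100, 64)).astype(np.float64)
x[0, 0], x[1, 0] = 16841, 31238  # make sure both extremes are present

# 1. Per-feature min-max: every column is stretched to [0, 1] independently.
col_min, col_max = x.min(axis=0), x.max(axis=0)
per_feature = (x - col_min) / (col_max - col_min)

# 2. Fixed-range scaling: divide by the largest value a 16-bit pixel can hold.
fixed = x / 65535

# 3. Global min-max: one minimum and one maximum for the whole array.
global_mm = (x - x.min()) / (x.max() - x.min())

for name, arr in [("per-feature", per_feature), ("fixed", fixed), ("global", global_mm)]:
    print(f"{name:12s} min={arr.min():.3f} max={arr.max():.3f}")
```

With this pixel range, dividing by 65535 squeezes everything into roughly [0.26, 0.48], so the network sees far less contrast than with either min-max variant, and the per-feature variant additionally changes the relative brightness between pixel positions.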
The answer to this depends on the nature of your input data. Note that the answer below applies to both classification and regression tasks.
Transformed = (I - I.mean()) / I.std()
since they expect Gaussian data. Remember, the goal of normalization is to scale your domain down to [0, 1]. You should always think about how the transformation will affect in-sample and possible out-of-sample images. What will you teach the model? Do out-of-sample images fall in that same image space? What possible transformations might best map the training and out-of-sample images to a similar domain?
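One way to keep in-sample and out-of-sample images in the same domain is to compute the statistics on the training set only and reuse them everywhere else. The sketch below does this for the standardization formula above; the function names and the synthetic arrays are hypothetical, and using a single global mean and standard deviation (rather than per-pixel ones) is an assumption.

```python
import numpy as np

def fit_standardizer(train):
    """Compute a global mean and standard deviation from the training images only."""
    return float(train.mean()), float(train.std())

def standardize(images, mean, std):
    """Apply (I - mean) / std using previously fitted statistics."""
    return (images - mean) / std

# Synthetic stand-ins for vectorized training and out-of-sample images.
rng = np.random.default_rng(0)
train = rng.normal(loc=24000.0, scale=3000.0, size=(200, 64))
test = rng.normal(loc=25000.0, scale=3500.0, size=(50, 64))

mean, std = fit_standardizer(train)
train_z = standardize(train, mean, std)
test_z = standardize(test, mean, std)  # same statistics, never refitted on test data

print("train mean/std:", train_z.mean(), train_z.std())
print("test  mean/std:", test_z.mean(), test_z.std())
```

If the out-of-sample statistics come out far from 0 and 1, that is a hint the new images do not live in the same space as the training images.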