I am using Keras to set up neural networks. As input data, I use vectors in which each coordinate is either 0 (feature not present or not measured) or a value in some range, for instance between 5000 and 10000.
So my input value distribution is roughly a Gaussian centered around, say, 7500, plus a very thin peak at 0.
I cannot remove the vectors with 0 in some of their coordinates because almost all of them will have some 0s at some locations.
So my question is: how should I best normalize the input vectors? I see two possibilities:
Does anyone have advice on how to proceed?
Thanks!
Instead, represent each feature as 2 dimensions: the normalised value itself, and a flag saying whether that value is missing.
You can think of this as encoding an extra feature that says "the other feature is missing". This way the scale of each feature is normalised and all the information is preserved.
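As a rough illustration of that encoding, here is a minimal NumPy sketch. It is one possible way to do it, not the only one. The function name `encode_with_missing_indicator` is made up for this example, and it rests on a few assumptions: 0 always means "missing", the mean and standard deviation are computed from the present values only, and a missing entry gets 0 in the scaled column (which is the mean after standardisation).

```python
import numpy as np

def encode_with_missing_indicator(X, missing_value=0.0):
    """Hypothetical helper: standardise present values, add a 'missing' flag per feature."""
    X = np.asarray(X, dtype=float)
    present = X != missing_value                 # True where the feature was measured

    # Per-feature statistics over the present values only (assumption of this sketch).
    n_present = np.maximum(present.sum(axis=0), 1)   # avoid division by zero
    mean = np.where(present, X, 0.0).sum(axis=0) / n_present
    var = (np.where(present, X - mean, 0.0) ** 2).sum(axis=0) / n_present
    std = np.sqrt(var)
    std[std == 0] = 1.0                          # constant features: leave unscaled

    # Missing entries become 0 in the scaled column and 1 in the flag column.
    scaled = np.where(present, (X - mean) / std, 0.0)
    missing = (~present).astype(float)

    # Output layout: all scaled features first, then all missing flags.
    return np.concatenate([scaled, missing], axis=1), mean, std

X = np.array([[7000.0,    0.0],
              [8000.0, 9000.0],
              [   0.0, 6000.0]])
encoded, mean, std = encode_with_missing_indicator(X)
print(encoded)
```

The returned `mean` and `std` should be reused to transform validation and test data, so that all sets are scaled the same way. The encoded array has twice as many columns as the input, so the input dimension of the first Keras layer has to be doubled accordingly.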