I am applying a Stacked Autoencoder (SAE) to a dataset with 25K rows and 18 columns, all float values. The SAE is used for feature extraction, with encoding and decoding.
When I train the model without feature scaling, the loss stays around 50K, even after 200 epochs. When scaling is applied, the loss is around 3 from the first epoch.
Is it recommended to apply feature scaling when an SAE is used for feature extraction?
Does it impact accuracy during decoding?
Also worth noting: your much smaller loss after 1 epoch with scaling is, at least in part, simply a result of the much smaller values used to compute the loss, so the two loss numbers are not directly comparable.
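To make the point about loss magnitude concrete, here is a rough sketch using NumPy only. It assumes the loss is mean squared error and that scaling means per-column standardization. Neither is stated in the question, and the random data, the all-zeros "reconstruction" and the noisy `Z_hat` are hypothetical stand-ins, not a real SAE.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the real data: 18 float columns with large values.
X = rng.normal(loc=200.0, scale=150.0, size=(1000, 18))

# Standardize each column to zero mean and unit variance (assumed scaling).
mean = X.mean(axis=0)
std = X.std(axis=0)
Z = (X - mean) / std

# Stand-in for an untrained model: a reconstruction of all zeros.
# The same "bad" model gives very different MSE values on raw vs. scaled data.
mse_raw = np.mean((X - np.zeros_like(X)) ** 2)
mse_scaled = np.mean((Z - np.zeros_like(Z)) ** 2)

# Stand-in for a decoder output in scaled space: the input plus small noise.
Z_hat = Z + rng.normal(scale=0.1, size=Z.shape)

# Undo the scaling so the reconstruction is back in the original units.
X_hat = Z_hat * std + mean

mse_scaled_space = np.mean((Z_hat - Z) ** 2)
mse_original_units = np.mean((X_hat - X) ** 2)

print("untrained, raw data:    ", mse_raw)
print("untrained, scaled data: ", mse_scaled)
print("decoded, scaled space:  ", mse_scaled_space)
print("decoded, original units:", mse_original_units)
```

The scaling step itself is invertible, so it does not by itself lose information at decoding time. If you want to compare against the unscaled run, compute the error after mapping the decoder output back to the original units, as with `mse_original_units` here.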