I would like a mathematical proof of this. Does anyone know a paper that covers it, or can anyone work out the math?
See https://pvirie.wordpress.com/2016/03/29/linear-autoencoders-do-pca/ for a derivation. PCA is restricted to a linear map, while autoencoders can have nonlinear encoders/decoders.
A single-layer autoencoder with a linear transfer function is nearly equivalent to PCA, where "nearly" means that the weight matrix W found by the AE and the W found by PCA won't be the same, but the subspaces spanned by the respective W's will be.