python numpy scipy pca

Principal component analysis in Python


I'd like to use principal component analysis (PCA) for dimensionality reduction. Does numpy or scipy already have it, or do I have to roll my own using numpy.linalg.eigh?

I'd rather not rely on singular value decomposition (SVD) alone: my input data are quite high-dimensional (~460 dimensions), so I suspect SVD will be slower than computing the eigenvectors of the covariance matrix.

I was hoping to find a premade, debugged implementation that already makes the right decisions for when to use which method, and which maybe does other optimizations that I don't know about.


Solution

  • You might have a look at MDP (the Modular toolkit for Data Processing).

    I have not had the chance to test it myself, but I've bookmarked it exactly for the PCA functionality.
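
    If you do end up rolling your own, here is a minimal sketch of the covariance/eigh route mentioned in the question. It assumes rows are samples and columns are features. The function name pca_eigh and the random test data are made up for illustration; they are not part of numpy, scipy, or MDP.

```python
import numpy as np


def pca_eigh(X, n_components):
    """PCA via eigendecomposition of the covariance matrix (hypothetical helper).

    X is assumed to have shape (n_samples, n_features).
    Returns the projected data, the principal axes (as columns),
    and the variance explained by each retained axis.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)              # center each feature
    cov = np.cov(Xc, rowvar=False)       # (n_features, n_features), ddof=1
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:n_components]  # largest first
    components = eigvecs[:, order]
    variances = eigvals[order]
    return Xc @ components, components, variances


# Illustrative data only: 1000 samples in 460 dimensions, as in the question.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 460))
scores, components, variances = pca_eigh(X, n_components=10)
print(scores.shape)  # (1000, 10)
```

    With ~460 features the covariance matrix is only 460 x 460, so the eigh step stays small regardless of how many samples there are. Whether that actually beats an SVD of the centered data depends on your sample count and LAPACK build, so it is worth timing both on your own data.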