Principal component analysis in Python
I'd like to use principal component analysis (PCA) for dimensionality reduction. Does numpy or scipy already have it, or do I have to roll my own using numpy.linalg.eigh
?
I don't just want to use singular value decomposition (SVD) because my input data are quite high-dimensional (~460 dimensions), so I think SVD will be slower than computing the eigenvectors of the covariance matrix.
I was hoping to find a premade, debugged implementation that already makes the right decisions for when to use which method, and which maybe does other optimizations that I don't know about.