Tags: python, scikit-learn, pca

Getting pca.explained_variance_ratio_ for all components without doing PCA twice


I understand that explained_variance_ratio_ is easy to get from PCA, but it only covers the first n_components. Can explained_variance_ratio_ be obtained for all components without running PCA twice? After all, these values are derived from the full set of eigenvalues. The matrix is huge, so I am looking to reduce the computation time.


Solution

  • To answer your question:

    Yes, you can obtain explained_variance_ratio_ for all components without doing PCA twice. When you create the PCA object, you can pass n_components to set how many components to keep. If you leave it unset (the default, None), PCA keeps all min(n_samples, n_features) components.

    If you later want fewer components, you can take them from the fitted model without fitting again. Try the following:

    from sklearn.decomposition import PCA
    
    # n_components is left unset, so every component is kept
    pca = PCA()
    pca.fit(X_train)
    

    Now print the explained variance ratio of every component:

    print(pca.explained_variance_ratio_)
    

    To get the explained variance ratio for the first 10 components, slice the array:

    explained_variance_ratio_10 = pca.explained_variance_ratio_[:10]
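
To make the "fit once, reduce later" idea concrete, here is a minimal sketch. It assumes a small synthetic matrix in place of your X_train (your real data is not shown), and the names pca_full, k, X_reduced and X_manual are just illustrative. It relies on the fact that, without whitening, PCA.transform centers the data with mean_ and projects it onto components_, so the first k columns of the full projection are the k-component projection.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic stand-in for X_train (assumption: the real matrix is not shown).
rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 30))

# Fit once, keeping every component (n_components is left unset).
pca_full = PCA(svd_solver="full").fit(X_train)

# One ratio per component: min(n_samples, n_features) values in total.
ratios = pca_full.explained_variance_ratio_

# Running total of explained variance, handy for choosing k afterwards.
cumulative = np.cumsum(ratios)

k = 10  # illustrative choice

# Reduce to k dimensions without refitting: keep the first k columns
# of the full projection.
X_reduced = pca_full.transform(X_train)[:, :k]

# The same projection written out by hand from the fitted attributes.
X_manual = (X_train - pca_full.mean_) @ pca_full.components_[:k].T

print(ratios[:k])
print(cumulative[k - 1])
```

Keep in mind that fitting with all components means computing the full decomposition, so on a very large matrix this single fit is where the time goes; the slicing afterwards is cheap.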