I am using scikit-learn's PCA and trying to choose the minimum number of components $k$ that satisfies $1 - \frac{\sum_{i=1}^{k} S_{ii}}{\sum_{j=1}^{n} S_{jj}} \le 0.01$, where $S$ is the diagonal matrix from the SVD, in order to retain 99% of the variance.
Thanks.
Simply set n_components to a float between 0 and 1, and it will be used as a lower bound on the fraction of explained variance.
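As a minimal sketch of that suggestion, the snippet below fits PCA with n_components=0.99. The data is synthetic and made up purely for illustration, and svd_solver="full" is passed explicitly on the assumption that your scikit-learn version has that parameter (it is the solver that supports a fractional n_components).

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic data (an assumption for illustration): 200 samples, 10 features
# whose scales decay so that a few directions carry most of the variance.
rng = np.random.RandomState(0)
X = rng.normal(size=(200, 10)) * np.linspace(10.0, 0.1, 10)

# A float in (0, 1) asks PCA to keep enough components to explain
# at least that fraction of the total variance.
pca = PCA(n_components=0.99, svd_solver="full")
X_reduced = pca.fit_transform(X)

print("components kept:", pca.n_components_)
print("variance retained:", pca.explained_variance_ratio_.sum())
```

After fitting, pca.n_components_ holds the number of components that was actually selected, and the sum of pca.explained_variance_ratio_ should be at least 0.99.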
From the scikit-learn documentation:
n_components : int, None or string
Number of components to keep.
If n_components is not set, all components are kept: n_components == min(n_samples, n_features).
If n_components == 'mle', Minka's MLE is used to guess the dimension.
If 0 < n_components < 1, select the number of components such that the amount of variance that needs to be explained is greater than the percentage specified by n_components.
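To see how the 0 < n_components < 1 case relates to the criterion in the question, here is a rough sketch that computes the smallest k by hand from the cumulative explained variance ratio and compares it with what PCA picks. The data is synthetic, the names k_manual and k_auto are just illustrative, and the comparison assumes the threshold is not hit exactly (where floating-point rounding could matter).

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic data (an assumption for illustration only).
rng = np.random.RandomState(0)
X = rng.normal(size=(200, 10)) * np.linspace(10.0, 0.1, 10)

# Fit with all components and accumulate the per-component variance ratios.
full = PCA(svd_solver="full").fit(X)
cumulative = np.cumsum(full.explained_variance_ratio_)

# Smallest k whose cumulative ratio exceeds 0.99,
# i.e. 1 - cumulative[k - 1] < 0.01.
k_manual = int(np.argmax(cumulative > 0.99)) + 1

# Let PCA choose k from the fractional n_components.
k_auto = PCA(n_components=0.99, svd_solver="full").fit(X).n_components_

print(k_manual, k_auto)
```

The manual route is mainly useful if you want to inspect the whole cumulative curve, for example to compare several thresholds without refitting.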