numpy random scikit-learn sklearn-pandas

sklearn random_state is not working properly

I have read everything I could find on this, but I still do not understand what the problem is. I use TruncatedSVD with random_state set and then print explained_variance_ratio_.sum(). The value changes every time I run the code. Is this normal?

from sklearn.decomposition import TruncatedSVD
SVD = TruncatedSVD(n_components=40, n_iter=7, random_state=42)

XSVD = SVD.fit_transform(X)
print(SVD.explained_variance_ratio_.sum())

The problem is that I later use UMAP and plot the result, and I get a different graph every time I run the code. I cannot tell whether this is due to TruncatedSVD or UMAP. I set random_state=42 to keep the results from changing, but it seems to have no effect.

Solution

You are probably doing something wrong, because I cannot reproduce your issue with scikit-learn 0.22:

In [16]: import numpy as np
    ...: from sklearn.decomposition import TruncatedSVD
    ...:
    ...: rng = np.random.RandomState(42)
    ...: X = rng.randn(10000, 100)
    ...: def func(X):
    ...:     SVD = TruncatedSVD(n_components=40, n_iter=7, random_state=42)
    ...:     XSVD = SVD.fit_transform(X)
    ...:     print(SVD.explained_variance_ratio_.sum())
    ...: func(X); func(X); func(X)
0.43320350603512425
0.43320350603512425
0.43320350603512425
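
Since a fixed random_state gives the same value for the same X, one possible explanation (only a guess, as the question does not show how X is built) is that the input differs between runs, for example because it is sampled or shuffled without a seed. The sketch below contrasts the two cases. The helper name `svd_variance` and the array shapes are made up for illustration. The same kind of check could be applied to the UMAP step on its own, with `random_state` passed to `umap.UMAP`, to see which stage varies.

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD


def svd_variance(X, seed=42):
    """Fit TruncatedSVD with a fixed seed; return the summed explained variance ratio."""
    svd = TruncatedSVD(n_components=40, n_iter=7, random_state=seed)
    svd.fit(X)
    return svd.explained_variance_ratio_.sum()


# Case 1: identical input and fixed random_state on every call.
X_fixed = np.random.RandomState(0).randn(1000, 100)
same_input = [svd_variance(X_fixed) for _ in range(3)]

# Case 2 (assumed cause, not confirmed by the question): the input is
# regenerated without a seed, so X differs between calls even though
# random_state is fixed.
different_input = [svd_variance(np.random.randn(1000, 100)) for _ in range(3)]

# Spread (max - min) of the results in each case.
print("fixed input spread:  ", max(same_input) - min(same_input))
print("varying input spread:", max(different_input) - min(different_input))
```

If the first spread is zero (or at floating-point noise level) and the second is not, the variation comes from the data fed to TruncatedSVD rather than from random_state.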