python, scikit-learn, mnist

Why am I getting a ConnectionResetError when calling fetch_mldata for MNIST?


I get a ConnectionResetError whenever I try to fetch the MNIST data, and I'm not sure why it's happening. The code comes from a scikit-learn tutorial on PCA and t-SNE dimensionality reduction. I thought it might be a Python version problem, but it fails the same way in 2.6, 3.5 and 3.7.

from sklearn.datasets import fetch_mldata

mnist = fetch_mldata("MNIST original")
X = mnist.data / 255.0
y = mnist.target

ConnectionResetError                      Traceback (most recent call last)
<ipython-input-11-781ac9f03cc8> in <module>()
----> 1 mnist = fetch_mldata("MNIST original")
      2 X = mnist.data / 255.0
      3 y = mnist.target

/anaconda3/envs/py35/lib/python3.5/site-packages/sklearn/datasets/mldata.py in fetch_mldata(dataname, target_name, data_name, transpose_data, data_home)
    152         urlname = MLDATA_BASE_URL % quote(dataname)
    153         try:
--> 154             mldata_url = urlopen(urlname)
    155         except HTTPError as e:
    156             if e.code == 404:

ConnectionResetError: [Errno 54] Connection reset by peer

Solution

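  • fetch_mldata has been deprecated since scikit-learn v0.20 and replaced by fetch_openml. It downloads from mldata.org, which is no longer operational, and that is most likely why the connection gets reset regardless of the Python version. Here is how to use fetch_openml for MNIST in v0.21: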

    from sklearn.datasets import fetch_openml
    X, y = fetch_openml('mnist_784', version=1, return_X_y=True)
    

    See the documentation for an example.
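
To reproduce the X and y from the question with fetch_openml, something like the sketch below may help. It assumes NumPy is installed. The load_mnist helper and its fetch parameter are hypothetical names introduced here for illustration, not part of scikit-learn. One difference worth knowing: fetch_openml returns the MNIST labels as strings rather than numbers, so the sketch converts them to integers.

```python
import numpy as np


def load_mnist(fetch=None):
    """Return (X, y) with pixels scaled to [0, 1] and integer labels.

    `fetch` is a hypothetical hook (not a scikit-learn feature): any callable
    returning (X, y). It defaults to downloading MNIST from OpenML.
    """
    if fetch is None:
        # Imported lazily so this module loads even without scikit-learn.
        from sklearn.datasets import fetch_openml

        def fetch():
            return fetch_openml("mnist_784", version=1, return_X_y=True)

    X, y = fetch()
    # np.asarray accepts both NumPy arrays and pandas objects, whichever
    # the installed scikit-learn version returns.
    X = np.asarray(X, dtype=np.float64) / 255.0  # same scaling as in the question
    y = np.asarray(y).astype(np.int64)           # labels arrive as strings like "5"
    return X, y
```

Calling `X, y = load_mnist()` triggers the download on first use; scikit-learn caches the dataset locally, so later calls should not need the network.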