scikit-learn, jupyter-notebook, pca

Why does the kernel restart when I try sklearn PCA?


I use IPython Notebook, and when I run this code:

import numpy as np
from sklearn.decomposition import PCA
pca = PCA(n_components=2)
pca.fit(data)

I receive a notice that the kernel has died and has restarted. What is going on?

Also my data is in this format:

array([[  0.00000000e+00,   3.13000000e+02,   3.10000000e+02, ...,
      9.00000000e+00,   6.00000000e+00,   2.00000000e+01],
      [  3.00000000e+00,   2.06900000e+03,   2.06700000e+03, ...,
      1.90000000e+01,   7.00000000e+00,   3.20000000e+01],
      [  4.00000000e+00,   2.54200000e+03,   2.54000000e+03, ...,
      1.10000000e+01,   1.10000000e+01,   1.10000000e+01],

EDIT:

The data itself is not that large (~3 MB). If it helps, I am using IPython Notebook.

I tried a simple 3x3 test matrix as input and hit the same problem, so it is probably not related to the data size either:

import numpy as np
from sklearn.decomposition import PCA

data = np.array([[1,2,3],[1,4,6],[2,8,11]])
pca = PCA(n_components=2)
pca.fit(data)

I also tried sklearn's PCA from a plain Python session in the terminal:

>>> from sklearn.decomposition import PCA
>>> pca = PCA()
>>> import numpy as np
>>> X = np.array([[1,2,3],[1,5,7],[2,6,10]])
>>> y = np.array([1,2,3])
>>> pca.fit(X, y)

And got:

Illegal instruction (core dumped)
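
On Linux, "Illegal instruction" is the shell's message for a process killed by SIGILL, which is also why the notebook kernel dies without a Python traceback. One way to confirm this without losing the notebook session is to run the failing snippet in a separate interpreter and inspect its exit status. The following is only a minimal sketch: `run_isolated` and `PCA_SNIPPET` are hypothetical names introduced here, and the signal decoding assumes a POSIX system.

```python
import signal
import subprocess
import sys

# Hypothetical snippet under test: the same PCA fit as in the question.
PCA_SNIPPET = """
import numpy as np
from sklearn.decomposition import PCA
X = np.array([[1, 2, 3], [1, 5, 7], [2, 6, 10]])
PCA(n_components=2).fit(X)
"""


def run_isolated(code):
    """Run `code` in a fresh interpreter and return (returncode, description)."""
    proc = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
    )
    rc = proc.returncode
    if rc == 0:
        return rc, "ok"
    if rc < 0:
        # On POSIX, a negative return code means the child was killed by a signal.
        try:
            name = signal.Signals(-rc).name
        except ValueError:
            name = "signal %d" % -rc
        return rc, "killed by " + name
    # A positive return code is an ordinary failure, e.g. an uncaught exception.
    return rc, "exited with error:\n" + proc.stderr


if __name__ == "__main__":
    # The crash, if any, stays inside the child process.
    print(run_isolated(PCA_SNIPPET))
```

If the description reports SIGILL, the crash comes from compiled code executing an instruction the CPU does not support, rather than from the data passed to `fit`.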

Solution

  • It seems that sklearn does not run well on a 32-bit machine. When I later ran the same code on a 64-bit server, it worked.
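
To compare the two machines, it may help to check whether each Python interpreter is a 32-bit or 64-bit build. This is a small sketch using only the standard library; `interpreter_bits` is a hypothetical helper name, and it reports the interpreter's build, not necessarily the capabilities of the CPU itself.

```python
import platform
import struct
import sys


def interpreter_bits():
    """Return the pointer size of the running interpreter, in bits."""
    return struct.calcsize("P") * 8


print("Python build:", interpreter_bits(), "bit")
print("Machine:", platform.machine())
print("Python version:", sys.version.split()[0])
```

Running this on both the failing machine and the working server shows whether the interpreters really differ in the way the answer suggests.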