python numpy matplotlib iris-dataset

Matplotlib is not showing my scatterplot?


When I call plt.show(), the plot shows only the PCA arrows and not the scatterplot of the first two iris features.

import numpy as np
import matplotlib.pylab as plt
from sklearn import decomposition

x = np.load("iris_features.npy")[:, :2]
y = np.load("iris_labels.npy")
idx = np.where(y != 0)
x = x[idx]
x[:, 0] -= x[:, 0].mean()
x[:, 1] -= x[:, 1].mean()


pca = decomposition.PCA(n_components=2)
pca.fit(x)
v = pca.explained_variance_ratio_


plt.scatter(x[:, 0], x[:, 1], marker='o', color='b')
ax = plt.axes()
x0 = v[0] * pca.components_[0, 0]
y0 = v[0] * pca.components_[0, 1]
ax.arrow(0, 0, x0, y0, head_width=0.05, head_length=0.1, fc='r', ec='r')
x1 = v[1] * pca.components_[1, 0]
y1 = v[1] * pca.components_[1, 1]
ax.arrow(0, 0, x1, y1, head_width=0.05, head_length=0.1, fc='r', ec='r')
plt.xlabel("$x_0$", fontsize=16)
plt.ylabel("$x_1$", fontsize=16)
plt.show()

resulting plot

What the correct plot should look like


Solution

  • Your code seems to work fine with the iris dataset bundled with sklearn, and it produces the expected result. You can try that, or you can share your dataset in a testable form, since in principle your .npy files could contain anything.

    import numpy as np
    import matplotlib.pylab as plt
    from sklearn import decomposition
    from sklearn import datasets
    
    iris = datasets.load_iris()
    
    x = iris.data[:, :2]  # first two features, as in the question
    y = iris.target
    
    idx = np.where(y != 0)
    x = x[idx]
    # ... your code continues here
    

    If that doesn't work, share your Python and matplotlib versions, or simply update them.
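Since the result seems to depend on the installed version, one thing that may be worth checking (an assumption, not a confirmed diagnosis of your setup) is the `plt.axes()` call after `plt.scatter(...)`. In recent matplotlib releases, `plt.axes()` with no arguments adds a new Axes to the figure, which could end up drawn on top of the scatterplot. A minimal sketch that sidesteps the question by creating a single Axes with `plt.subplots()` and drawing everything on it, using the sklearn copy of the iris data as a stand-in for your .npy files:

```python
import sys

import matplotlib
import matplotlib.pyplot as plt
from sklearn import datasets, decomposition

# Versions are useful to include when reporting the problem.
print("python:", sys.version.split()[0])
print("matplotlib:", matplotlib.__version__)

# Stand-in data: sklearn's iris set instead of the .npy files from the question.
iris = datasets.load_iris()
x = iris.data[:, :2]      # first two features
y = iris.target
x = x[y != 0]             # drop class 0 (boolean indexing returns a copy)
x -= x.mean(axis=0)       # center both features

pca = decomposition.PCA(n_components=2)
pca.fit(x)
v = pca.explained_variance_ratio_

# Create one Axes explicitly and draw both the points and the arrows on it.
fig, ax = plt.subplots()
ax.scatter(x[:, 0], x[:, 1], marker='o', color='b')
for i in range(2):
    # Scale each principal direction by its explained-variance ratio.
    dx, dy = v[i] * pca.components_[i]
    ax.arrow(0, 0, dx, dy, head_width=0.05, head_length=0.1, fc='r', ec='r')
ax.set_xlabel("$x_0$", fontsize=16)
ax.set_ylabel("$x_1$", fontsize=16)
```

Call `plt.show()` afterwards to display the figure. If you prefer to keep the pyplot-style calls from the question, replacing `ax = plt.axes()` with `ax = plt.gca()` should have the same effect, because `plt.gca()` returns the current Axes instead of adding one.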