I want to plot two feature vectors as a scatter plot in the same figure. I am doing PCA on MNIST.
The current feature vector, let's call it Elements,
has 784 rows.
print(Elements.shape)
(784,)
I want to plot Elements[-20]
and Elements[-19]
as a scatter plot in the same figure, and want to achieve something like the plot below.
I am struggling to add both elements to the same plot in different colors.
plt.scatter(X[-20], X[-19], c= 'r')
yields only one color, with no distinction between the two sets of scattered values.
As highlighted below, some of my data sets overlap, so the solution from SO doesn't work.
The first 20 elements of X[-20] are shown below.
0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00
0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00
0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00
2.84343259e-03 6.22613687e-03 -7.95592208e-15 -1.69063344e-14
1.34798763e-14 0.00000000e+00 6.36473767e-14 -3.18236883e-14
Regarding the visualization issue
You seem to be adding a scalar to your plot. What you need to do is separate your data first, and then make one scatter call for each of the sets. Like this:
import numpy as np
import matplotlib.pyplot as plt
def populate(a=2, b=5, dev=10, number=400):
    X = np.random.uniform(0, 50, number)
    Y = a*X + b + np.random.normal(0, dev, X.shape[0])
    return X, Y
num = 3000
x1, y1 = populate(number=num)
x2, y2 = populate(-0.2, 110, number=num)
x = np.hstack((x1, x2))
y = np.hstack((y1, y2))
fig, ax = plt.subplots(nrows=1, ncols=1)
# One scatter call per population, each with its own color
ax.scatter(x[:num], y[:num], color="blue", alpha=0.3)
ax.scatter(x[num:], y[num:], color="red", alpha=0.3)
howblack = 0.15
ax.set_facecolor((howblack, howblack, howblack))
plt.show()
This draws the two populations in the same figure, one in blue and one in red.
There are numerical procedures to separate your data, but that is not a visualization issue. See scikit-learn for some clustering methods. In your example, assuming Elements is some kind of array, you need to find a way to separate the data before plotting it.
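As a rough sketch of what "separating the data" could look like with one of scikit-learn's clustering methods, the snippet below runs KMeans on made-up 2D points and then makes one scatter call per cluster. The `points` array and the choice of two clusters are assumptions for illustration, not your MNIST data.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Hypothetical stand-in data: two blobs of 2-D points, stacked together.
points = np.vstack((
    rng.normal(loc=(0.0, 0.0), scale=1.0, size=(200, 2)),
    rng.normal(loc=(8.0, 8.0), scale=1.0, size=(200, 2)),
))

# Ask KMeans for two groups; labels[i] is the cluster index of points[i].
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(points)

fig, ax = plt.subplots()
for group, color in zip((0, 1), ("blue", "red")):
    mask = labels == group  # boolean mask selecting one cluster
    ax.scatter(points[mask, 0], points[mask, 1],
               color=color, alpha=0.3, label=f"cluster {group}")
ax.legend()
```

Call plt.show() afterwards to display the figure. If you already have labels (MNIST comes with digit labels), you can skip the clustering step and build the masks from those labels directly.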
Regarding the feature vector
A scatter plot typically assumes that you have at least X and Y data (so 2D or more).
You seem to be referring to a single feature vector, which is not enough information on its own: a vector with 784 dimensions is not exactly easy to show. So you need to decide what is X and what is Y in your scatter plot, and what to separate into differently colored populations.
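One common way to make that choice for PCA, sketched here under assumptions, is to use the first principal component as X, the second as Y, and the class label as the color. The data below is random and only mimics the shape of MNIST (784 features per sample); `samples` and `labels` are placeholder names for your own images and digit labels.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
# Hypothetical stand-in for MNIST: 300 samples with 784 features each,
# in two labelled classes. Replace with the real images and labels.
n_per_class, n_features = 150, 784
class_a = rng.normal(0.0, 1.0, size=(n_per_class, n_features))
class_b = rng.normal(0.5, 1.0, size=(n_per_class, n_features))
samples = np.vstack((class_a, class_b))
labels = np.repeat([0, 1], n_per_class)

# PCA via SVD: center the data, then project onto the first two principal axes.
centered = samples - samples.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
projected = centered @ vt[:2].T  # shape (300, 2): column 0 is X, column 1 is Y

fig, ax = plt.subplots()
for value, color in zip((0, 1), ("blue", "red")):
    mask = labels == value  # one scatter call per class
    ax.scatter(projected[mask, 0], projected[mask, 1],
               color=color, alpha=0.3, label=f"class {value}")
ax.set_xlabel("principal component 1")
ax.set_ylabel("principal component 2")
ax.legend()
```

Each point here is one sample (one image), not one entry of a feature vector. Call plt.show() to display the result.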