python numpy plot multidimensional-array line

Generate multi-dimensional line

I am generating a multi-dimensional line. Shouldn't the projection of the line onto each dimension be linear? The plots aren't.

from matplotlib import pyplot as plt
import numpy as np

n = 100  # samples
m = 2  # dimensions

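X = np.random.randint(0, 100, size=(n, m))  # n random points, m components each
b = np.random.randint(1, 3, m).reshape([m, 1])  # one coefficient per dimension, shape (m, 1)

y = np.dot(X, b)  # shape (n, 1): one scalar per row of X

# plot y against each component of X in turn
for i in range(m):
    plt.scatter(X[:, i], y)
    plt.show()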

Solution

This looks like a possible misunderstanding of the dot product. Consider a scaled-down example with n=3:

In [572]: X
Out[572]: 
array([[86, 85],
       [60, 37],
       [36, 57]])

In [573]: y
Out[573]: 
array([[342],
       [194],
       [186]])

In [574]: 86+2*85
Out[586]: 256

In [587]: b
Out[587]: 
array([[2],
       [2]])

Notice that the first value of y is 2*86 + 2*85 = 342, as expected from the definition of the dot product. So the ratio of y[0] to X[0][0] is about 3.98. For the second row, the ratio of y[1] to X[1][0] is about 3.23. The ratio is clearly not constant, so the first components of the X vectors don't have a linear relationship with y, as you saw in your plots. Why is this?

Consider some other case where the first component of X[0] is 86; the second component could be anything in the given range (the values are generated randomly). So why would we expect any particular ratio between the first component of X[0] and y[0]?

Imagine the case where the first component of X[0] was 0 and b was [2, 2]. y[0] is not guaranteed to be 0; it will be 2 times the second component of X[0]. The relationship is between the vectors in X and the scalars in y, not between the individual components of those vectors and the scalars in y.
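To check the numbers above, here is a minimal sketch that rebuilds the scaled-down example. The values of X and b are copied from the session; the names ratios, X_zero and y_zero are introduced here only for illustration.

```python
import numpy as np

# Values copied from the scaled-down example in the session above.
X = np.array([[86, 85],
              [60, 37],
              [36, 57]])
b = np.array([[2],
              [2]])

y = np.dot(X, b)  # shape (3, 1): row k is 2*X[k, 0] + 2*X[k, 1]
print(y[:, 0].tolist())  # [342, 194, 186]

# Ratio of each y value to the first component of the matching row of X.
ratios = y[:, 0] / X[:, 0]
print(np.round(ratios, 2).tolist())  # [3.98, 3.23, 5.17]

# Illustrative variant: force the first component of every row to 0.
X_zero = X.copy()
X_zero[:, 0] = 0
y_zero = np.dot(X_zero, b)
print(y_zero[:, 0].tolist())  # [170, 74, 114]
```

The ratios differ from row to row, and y stays nonzero even when the first component is 0, because the second component still contributes to the dot product.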

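What you may want instead is for X to be scalar-valued (X = np.random.randint(1, 100, size=n)) and y to be a vector. Then generate y as X*b. With b shaped (m, 1) as in your code, X*b broadcasts to shape (m, n), so each row of y is one dimension. Now plotting X against each dimension of y gives a line.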
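A minimal sketch of that suggestion, assuming b keeps the (m, 1) shape from the question. The fixed seed is an assumption added only to make the run reproducible, and the plotting is replaced by a numeric check so the snippet runs without a display.

```python
import numpy as np

np.random.seed(0)  # assumption: fixed seed, only for reproducibility

n = 100  # samples
m = 2    # dimensions

X = np.random.randint(1, 100, size=n)           # scalar-valued samples, shape (n,)
b = np.random.randint(1, 3, m).reshape([m, 1])  # one slope per dimension, shape (m, 1)

y = X * b  # broadcasting gives shape (m, n); row i is b[i] * X

# Each dimension of y is a constant multiple of X, i.e. a line through the origin.
for i in range(m):
    slope = y[i] / X
    print(i, np.allclose(slope, b[i, 0]))
```

Swapping the print for plt.scatter(X, y[i]) followed by plt.show(), as in the original loop, should then draw a straight line for each dimension.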