Why did they only use X in the for loop and not both X and Y? And why are we using reshape with (1, -1)?
# implement a loop which computes Euclidean distances between each element in X and Y
# store results in euclidean_distances_vector_l list
X = np.random.uniform( low=lower_boundary, high=upper_boundary, size=(sample_size, n) )
Y = np.random.uniform( low=lower_boundary, high=upper_boundary, size=(sample_size, n) )
for index, x in enumerate(X):
    euclidean_distances_vector_l.append(euclidean_distances(x.reshape(1, -1), Y[index].reshape(1, -1)))
I haven't used numpy much, but here's my best guess at your questions.
The reason the code only iterates through X instead of both X and Y is that it isn't pairing each value of X with every value of Y. Instead, it wants each value in X along with its corresponding value in Y (the one at the same index). Consider the following example:
X = [0, 1, 2, 3, 4]
Y = [5, 6, 7, 8, 9]
for index, x in enumerate(X):
    print(x, Y[index])
# Prints:
# 0 5
# 1 6
# 2 7
# 3 8
# 4 9
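To make the difference concrete, here is a small sketch contrasting the index-wise pairing above with what looping over both X and Y would produce. The shortened lists and the names paired and all_pairs are only for illustration; zip is used as an equivalent of the enumerate pattern, not because the original code uses it.

```python
from itertools import product

X = [0, 1, 2]
Y = [5, 6, 7]

# Index-wise pairing: one pair per index, the same pairs that
# enumerate(X) together with Y[index] would visit.
paired = list(zip(X, Y))

# All combinations: every x with every y, which is what a nested
# loop over both X and Y would visit.
all_pairs = list(product(X, Y))

print(paired)          # [(0, 5), (1, 6), (2, 7)]
print(len(all_pairs))  # 9
```

With n elements in each list, the paired version does n distance computations, while the nested version would do n * n.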
As for your question about reshape, the documentation states that one dimension of the new shape can be -1, in which case its length is inferred from the length of the array and the remaining dimensions. My guess, then, is that x.reshape(1, -1) will restructure x into a 2D array where the length of the first dimension is 1 and the length of the second is as long as it needs to be to hold all the values in x. For example:
import numpy as np

X = np.array([1, 2, 3, 4, 5])  # reshape is a method of numpy arrays, not of plain lists
X2 = X.reshape(1, -1)
# The value of X2 will be:
# [[1, 2, 3, 4, 5]]
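As for why the reshape is needed at all: assuming euclidean_distances in the question is scikit-learn's sklearn.metrics.pairwise.euclidean_distances, that function expects 2D arrays (rows are samples), so a single 1D row has to be turned into a 1-row matrix first. The sketch below only checks the shapes and then computes the same paired distances with np.linalg.norm, which is swapped in here to keep the example to numpy alone. The array sizes, the seed, and the names rng, loop_distances, and vectorized are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen only for reproducibility
X = rng.uniform(low=0.0, high=1.0, size=(4, 3))
Y = rng.uniform(low=0.0, high=1.0, size=(4, 3))

row = X[0]
print(row.shape)                 # (3,)   a 1D array
print(row.reshape(1, -1).shape)  # (1, 3) one row, length inferred
print(row.reshape(-1, 1).shape)  # (3, 1) one column, length inferred

# Paired distances: row i of X against row i of Y, as in the question's loop.
loop_distances = [np.linalg.norm(x - Y[index]) for index, x in enumerate(X)]

# The same result without an explicit Python loop.
vectorized = np.linalg.norm(X - Y, axis=1)
```

Both versions give one distance per row, so the result has sample_size entries rather than sample_size squared.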