This is my code. I'm getting a broadcast error and I can't understand why. I have looked at other similar questions, which discuss problems with dimensions, but I was unable to find the problem. Any help is appreciated. Thanks in advance. I have attached an image of the error.
Both arrays (ns and X_train_grade_encoded) have the same shape, so why is there an error?
So I looked at your notebook image. It is a small png that requires zoom to read. We strongly encourage, some even demand, that you copy-n-paste code and errors. We need to see the problem, right up front, not hidden. Otherwise we are likely to move to the next question.
Broadcast errors usually occur when doing some sort of math on two arrays, or when (my second guess) assigning one array to a slice of another. But this case is a more obscure one: trying to make an object dtype array from (n,4) and (n,300) shaped arrays.
You are doing hstack((ns, array2)). With an ordinary np.hstack that would work and produce an (n, 304) shaped array. But you are using scipy.sparse.hstack. I don't know if that was intentional or a mistake. You haven't hinted that you are working with sparse matrices.
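A quick check of that shape claim, as a minimal sketch. The arrays below are stand-ins (the real arrays are not available as text in the question), and n = 5 is an arbitrary choice:

```python
import numpy as np

n = 5  # arbitrary row count, chosen only for illustration
ns_demo = np.ones((n, 4))          # stand-in for the (n, 4) array ns
second_demo = np.zeros((n, 300))   # stand-in for the (n, 300) array

# np.hstack joins dense arrays along columns; the row counts must match.
joined = np.hstack((ns_demo, second_demo))
print(joined.shape)  # (5, 304)
```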
ns probably was constructed from a sparse matrix, since you use toarray(). But it is now a dense (numpy) array.
sparse.hstack is intended for sparse matrices, returning a sparse matrix. I don't know the exact limits on using dense array inputs. I believe it can convert dense to coo sparse and then do its join, but here the error occurred before it got to that step.
This reproduces your error:
In [37]: from scipy import sparse
Trying to use sparse hstack on two dense arrays:
In [38]: sparse.hstack([np.ones((3,4)),np.zeros((3,2))])
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-38-a9d8036b5a44> in <module>
----> 1 sparse.hstack([np.ones((3,4)),np.zeros((3,2))])
/usr/local/lib/python3.6/dist-packages/scipy/sparse/construct.py in hstack(blocks, format, dtype)
463
464 """
--> 465 return bmat([blocks], format=format, dtype=dtype)
466
467
/usr/local/lib/python3.6/dist-packages/scipy/sparse/construct.py in bmat(blocks, format, dtype)
543 """
544
--> 545 blocks = np.asarray(blocks, dtype='object')
546
547 if blocks.ndim != 2:
/usr/local/lib/python3.6/dist-packages/numpy/core/_asarray.py in asarray(a, dtype, order)
83
84 """
---> 85 return array(a, dtype, copy=False, order=order)
86
87
ValueError: could not broadcast input array from shape (3,4) into shape (3)
But if we first convert one (even the 2nd) to sparse:
In [39]: sparse.hstack([np.ones((3,4)),sparse.coo_matrix(np.zeros((3,2)))])
Out[39]:
<3x6 sparse matrix of type '<class 'numpy.float64'>'
with 12 stored elements in COOrdinate format>
In [40]: _.A
Out[40]:
array([[1., 1., 1., 1., 0., 0.],
[1., 1., 1., 1., 0., 0.],
[1., 1., 1., 1., 0., 0.]])
Of course, the right way to join two dense arrays is np.hstack:
In [41]: np.hstack([np.ones((3,4)),np.zeros((3,2))])
Out[41]:
array([[1., 1., 1., 1., 0., 0.],
[1., 1., 1., 1., 0., 0.],
[1., 1., 1., 1., 0., 0.]])
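If a sparse result is what you actually want, the conversion used in In [39] can be applied to every dense block before the join. This is only a sketch: hstack_as_sparse is a hypothetical helper name, not part of scipy, and using coo_matrix for the conversion is my assumption.

```python
import numpy as np
from scipy import sparse

def hstack_as_sparse(blocks):
    """Hypothetical helper: make every block sparse, then join by columns."""
    # Leave sparse inputs alone; wrap dense arrays in a COO matrix.
    converted = [b if sparse.issparse(b) else sparse.coo_matrix(b) for b in blocks]
    return sparse.hstack(converted)

a = np.ones((3, 4))
b = np.zeros((3, 2))
result = hstack_as_sparse([a, b])

print(result.shape)  # (3, 6)
# The dense view of the sparse result matches the plain np.hstack result.
print(np.array_equal(result.toarray(), np.hstack([a, b])))  # True
```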
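The array(..., object) error is a bit obscure; it arises because both arrays are dense and have the same first dimension. With matching first dimensions, numpy appears to aim for a (2,3) object array rather than a 2-element one, and then cannot fit a (3,4) block into a (3,) row, which is what the error message says. It's a known issue in numpy. Since sparse.hstack was intended for use on sparse matrices, its developers can be excused for ignoring this numpy misuse.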
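For what it's worth, in plain numpy a common way to get a 2-element object array regardless of the block shapes is to allocate it first and fill it slot by slot. A minimal sketch, with variable names of my own choosing:

```python
import numpy as np

a = np.ones((3, 4))
b = np.zeros((3, 2))

# Allocate a 1-D object array first, then assign each block to one slot.
# Single-element assignment stores the array itself instead of unpacking it.
blocks = np.empty(2, dtype=object)
blocks[0] = a
blocks[1] = b

print(blocks.shape)     # (2,)
print(blocks[0].shape)  # (3, 4)
print(blocks[1].shape)  # (3, 2)
```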
===
sparse.vstack does work with dense arrays with shapes like (3,4) and (5,4), because np.array(..., object) does make a valid object dtype array. But if the shapes match, e.g. (3,4) and (3,4), neither hstack nor vstack works, and the error message is different from yours.
In [66]: sparse.hstack((np.ones((3,2)),np.zeros((3,2))))
...
ValueError: blocks must be 2-D
So we need to take the docs seriously.