Background: I am trying to run a nearest-neighbor search using the cKDTree function on a shapefile that has 201 records with lat/lons against a time series dataset of 8760 hours (total hours in a year). I am getting an error, so naturally I looked it up. I found this: scipy.spatial ValueError: "x must consist of vectors of length %d but has shape %s", which is the same error, but I am having trouble understanding how exactly it was resolved.
Workflow: I pulled the x & y coordinates out of the shapefile and stored them in separate arrays called x_vector and y_vector. The 8760-hour data is in an HDF5 file. I pulled the coordinates out using h5_coords = np.vstack([meta['latitude'], meta['longitude']]).T.
Now I try to run the kdtree:
# Run the kdtree to match nearest values
tree = cKDTree(np.vstack([x_vector, y_vector]))
kdtree_indices = tree.query(h5_coords)[1]
but it results in this same traceback error.
Traceback Error:
Traceback (most recent call last):
  File "meera_extract.py", line 45, in <module>
    kdtree_indices = tree.query(h5_coords)[1]
  File "scipy/spatial/ckdtree.pyx", line 618, in scipy.spatial.ckdtree.cKDTree.query (scipy/spatial/ckdtree.cxx:6996)
ValueError: x must consist of vectors of length 201 but has shape (1, 389880)
Help me, stackoverflow. You're my only hope.
So it seems I need to read up on the differences between vstack and column_stack and on the use of the transpose, i.e. .T. If anyone has the same issue, here is what I changed to make cKDTree work. Hopefully it will help if someone else runs into this. Many thanks to the community for the comments that helped solve this!
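To make the difference concrete, here is a minimal sketch of the shapes involved. The coordinate values are made-up stand-ins (assumptions, not the real shapefile data); only the array length of 201 comes from the question. As far as I can tell, this is where the "vectors of length 201" in the error message comes from: np.vstack on two 1-D arrays gives 2 rows of 201 values, so the tree sees 2 points in 201 dimensions.

```python
import numpy as np
from scipy.spatial import cKDTree

# Hypothetical stand-ins for the 201 shapefile coordinates.
x_vector = np.linspace(0.0, 1.0, 201)
y_vector = np.linspace(10.0, 11.0, 201)

# vstack puts each 1-D array on its own row: 2 rows, 201 columns.
stacked_rows = np.vstack([x_vector, y_vector])
# column_stack puts each 1-D array in its own column: 201 rows, 2 columns.
stacked_cols = np.column_stack([x_vector, y_vector])

print(stacked_rows.shape)  # (2, 201)
print(stacked_cols.shape)  # (201, 2)

# For 1-D inputs, transposing the vstack result gives the column_stack result.
same_after_transpose = np.array_equal(stacked_rows.T, stacked_cols)

# cKDTree reads its input as (n points, m dimensions).
bad_tree = cKDTree(stacked_rows)   # 2 points, each of length 201
good_tree = cKDTree(stacked_cols)  # 201 points, each of length 2
print(bad_tree.n, bad_tree.m)
print(good_tree.n, good_tree.m)
```

With the tree built from the (2, 201) array, any query array has to have 201 values per point, which a two-column lat/lon array does not.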
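I changed how the HDF5 coordinates were brought in, switching from vstack to column_stack and removing the transpose .T. For 1-D inputs both forms give the same (N, 2) array, so this part is mostly a readability change.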
# Get coordinates of HDF5 file
h5_coords = np.column_stack([meta['latitude'], meta['longitude']])
Instead of stacking the points inside the cKDTree call with vstack, I made a new variable to hold them, built with column_stack so that each row is one point:
# combine x and y
vector_pnts = np.column_stack([x_vector, y_vector])
Then I ran the kdtree without any error.
# Run the kdtree to match nearest values
tree = cKDTree(vector_pnts)
kdtree_indices = tree.query(h5_coords)[1]
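For anyone who wants to try this end to end, below is a self-contained sketch of the working version with random data. Everything except the variable names and the count of 201 points is an assumption for illustration: the meta dictionary stands in for the HDF5 metadata, the 500 query sites are arbitrary, and x_vector is treated here as latitude so that the column order matches h5_coords.

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(0)

# Hypothetical stand-ins for the shapefile points (201 records).
x_vector = rng.uniform(30.0, 40.0, 201)      # assumed to be latitudes
y_vector = rng.uniform(-110.0, -100.0, 201)  # assumed to be longitudes

# Hypothetical stand-in for the HDF5 metadata (500 sites, arbitrary count).
meta = {
    "latitude": rng.uniform(30.0, 40.0, 500),
    "longitude": rng.uniform(-110.0, -100.0, 500),
}

# One row per point, one column per coordinate, same column order in both.
vector_pnts = np.column_stack([x_vector, y_vector])
h5_coords = np.column_stack([meta["latitude"], meta["longitude"]])

# Build the tree on the shapefile points and query it with the HDF5 sites.
tree = cKDTree(vector_pnts)
distances, kdtree_indices = tree.query(h5_coords)

print(vector_pnts.shape, h5_coords.shape)  # (201, 2) (500, 2)
print(kdtree_indices.shape)                # (500,)
```

kdtree_indices has one entry per row of h5_coords, and each entry is a row index into vector_pnts. The column order has to match between the two arrays, so if x_vector actually holds longitudes, the query array should be stacked as longitude, latitude instead.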