I need to compute a number of geometric ranges from arbitrary points to surface heights in HAE from a digital elevation model (DEM). This surface height data is provided in WGS84 Lat/Long/MSL elevation, which I can adjust into HAE from a geoid model and store prior to program execution. I can translate the required DEM data for the needed area around the reference point into an array of local-level points (East/North/Up is what I'm using here).
The reference points data is provided in a local-level reference based on a specific ENU origin point in Lat/Long/Height which I don't know until runtime. The most significant compute task is an iteration of various distance computations between (E,N,U) reference points in the local-level frame and up to ~100,000 sampled points in the surface reference contour. Not a big problem for NumPy.
The DEM data translated into the ENU frame is not regular in East/North alignment. The algorithm I'm implementing requires me to return ranges to regularly spaced East/North points and return computations specific to a raster of East/North surface points. My ENU-space DEM object is a ~6000x7000x3 ndarray with the relevant search area surface locations. Viewed in East/North it's warped a bit like a trapezoid that's wider and slightly bowed in the direction of the equator as a result of the transformation.
The best I've been able to do is:
import numpy as np
from scipy.interpolate import NearestNDInterpolator
# Fake DEM data (don't know how to warp it in the way the TED output is, so just adding some rand) This
# could use a rectilinear interpolator, but I believe my data cannot (warped in E as a function of N and
# vice versa).
e_dim = 7000
n_dim = 6000
e_rng = np.arange(-e_dim/2, e_dim/2, dtype='float32')*72 + np.random.randint(-100, high=101, size=e_dim)
n_rng = np.arange(-n_dim/2, n_dim/2, dtype='float32')*95 + np.random.randint(-100, high=101, size=n_dim)
dem_u = np.random.randint(-100, high=4500, size=(n_dim,e_dim))
row, col = np.meshgrid(n_rng, e_rng, indexing='ij')
ENU = np.dstack((row,col,dem_u))
# looking for a faster approach to lookup Nx2 arrays of E/N points that are on a regular sample grid for # rastered final output
interp = NearestNDInterpolator(
(ENU[:,:,0].flatten(),ENU[:,:,1].flatten()),
ENU[:,:,2].flatten())
# times about 12-13s on my hardware--pretty darn slow.
interval_m = 11338.7 # Just a coarse starting sample dimension
samp_offsets_m = np.array([-150*1852+i*interval_m for i in
range(pts_side) ] )
tiles_x, tiles_y = np.meshgrid(samp_offsets_m, samp_offsets_m, indexing='xy' )
pTL = np.asarray( [[*tiles_x.flatten()], E pos
[*tiles_y.flatten()]]) N pos
interp(pTL[0], pTL[1]) # pretty fast, about 4ms to update 'Up' on ~2500 samples.
Is there a better way to create a regularly aligned structure to do DEM lookups into in the local-level plane? My target environment can't run GDAL/Rasterio, etc. I can prep the DEM set with those tools but it isn't in the runtime environment.
Provided your procedure is correct for the task you are performing and we want to keep it, there is few improvement to do.
As you noticed, building the interpolator is time consuming:
%timeit -n 1 -r 30 interp = NearestNDInterpolator((ENU[:,:,0].flatten(), ENU[:,:,1].flatten()), ENU[:,:,2].flatten())
# 11.8 s ± 488 ms per loop (mean ± std. dev. of 30 runs, 1 loop each)
And takes almost the same amount of time on my machine, they should be comparable.
First, you can use ravel
instead of flatten
that will avoid a copy of your arrays and spare a bit of memory.
%timeit -n 1 -r 30 interp = NearestNDInterpolator((ENU[:,:,0].ravel(), ENU[:,:,1].ravel()), ENU[:,:,2].ravel())
# 11.7 s ± 630 ms per loop (mean ± std. dev. of 30 runs, 1 loop each)
But it does not makes any time improvement, it's just about memory.
Now we must consider the fact that most of the time spent by this step is building the underlying KD-Tree. Fortunately, NearestNDInterpolator
gives you the opportunity to setup how this cKDTree
must be built.
%timeit -n 1 -r 30 interp = NearestNDInterpolator((ENU[:,:,0].ravel(), ENU[:,:,1].ravel()), ENU[:,:,2].ravel(), tree_options={"compact_nodes": False, "balanced_tree": False, "leafsize": 64})
# 3.67 s ± 254 ms per loop (mean ± std. dev. of 30 runs, 1 loop each)
Making a coarser tree greatly reduces the building time, but may increase the prediction time.
If we compare, prediction time (with pt_side=2500
):
%timeit -n 1 -r 30 interp(pTL[0], pTL[1])
# Default tree: 4.36 s ± 149 ms per loop (mean ± std. dev. of 30 runs, 1 loop each)
# Tweaked tree: 5.47 s ± 139 ms per loop (mean ± std. dev. of 30 runs, 1 loop each)
Coarser tree requires about 1s more to predict, but needs 8 seconds less to build.
11.7 - 3.67 = 8.03
(11.7 + 4.36) / (3.67 + 5.47) = 1.757111597374179
(11.7 + 4.36) - (3.67 + 5.47) = 6.919999999999998
This is not a 10 or 100 factor optimization, but keeping your procedure as is it's possible to reduce the execution time for a factor of about 1.75 (sparing about 7s by run) by tweaking the tree configuration.