Using GPflow 2.0, I want to implement a custom Matern 5/2 kernel with the Haversine instead of Euclidean distance. I created a custom class on top of the gpflow.kernels.Matern52
class that includes a scaled_squared_dist
function to override the scaled_squared_euclid_dist
inherited from the Stationary
class.
The class as currently written does not alter the Matern52 class; GP regression with this HaversineKernel_Matern52
kernel behaves exactly like GP regression with a Matern52 kernel.
import gpflow
from gpflow.utilities.ops import square_distance
class HaversineKernel_Matern52(gpflow.kernels.Matern52):
"""
Isotropic Matern52 Kernel with Haversine distance instead of euclidean distance.
Assumes 2-dimensional inputs, with columns [latitude, longitude] in degrees.
"""
def __init__(self, lengthscale=1.0, variance=1.0, active_dims=None, ard=None):
super().__init__(active_dims=active_dims, variance=variance,
lengthscale=lengthscale, ard=ard)
def haversine_dist(self, X, X2):
pi = np.pi / 180
f = tf.expand_dims(X * pi, -2) # ... x N x 1 x D
f2 = tf.expand_dims(X2 * pi, -3) # ... x 1 x M x D
d = tf.sin((f - f2) / 2) ** 2
lat1, lat2 = tf.expand_dims(X[:, 0] * pi, -1), \
tf.expand_dims(X2[:, 0] * pi, -2)
cos_prod = tf.cos(lat2) * tf.cos(lat1)
a = d[:,:,0] + cos_prod * d[:,:,1]
c = tf.asin(tf.sqrt(a)) * 6371 * 2
return c
def scaled_squared_dist(self, X, X2=None):
"""
Returns ||(X - X2ᵀ) / ℓ||² i.e. squared L2-norm.
"""
X_scaled = X / self.lengthscale
X2_scaled = X2 / self.lengthscale if X2 is not None else X2
return square_distance(X_scaled, X2_scaled)
What do I need to change to make this kernel recalculate Haversine distance properly?
This question builds on GPflow issue #969.
Thank you!
The GP code makes use of a kernel's K
(and K_diag
) methods. In GPflow 2.0.0rc1 and the develop branch, for subclasses of Stationary
, K
calls self.scaled_squared_euclid_dist
-- but the method you define in your Haversine version is called scaled_squared_dist
, so this is a new method and you don't actually overwrite its base class method from the Matern52
kernel class! (Maybe the method in gpflow.kernels.stationaries.Stationary would be better called scaled_squared_dist
instead.)
Moreover, your scaled_squared_dist
only calls square_distance
instead of self.haversine_dist
; assuming the latter returns a distance, not its square, you also need to wrap it in tf.square()
. The haversine_dist
method also does not seem to take into account the lengthscale
parameter.
If you want to try out several different kernels with a Haversine distance, a more robust/reusable way of coding this up might be to write a wrapper class that takes any stationary kernel as an argument, and redefining the kernel matrix methods:
def haversine_dist(X, X2):
pi = np.pi / 180
f = tf.expand_dims(X * pi, -2) # ... x N x 1 x D
f2 = tf.expand_dims(X2 * pi, -3) # ... x 1 x M x D
d = tf.sin((f - f2) / 2) ** 2
lat1, lat2 = tf.expand_dims(X[:, 0] * pi, -1), \
tf.expand_dims(X2[:, 0] * pi, -2)
cos_prod = tf.cos(lat2) * tf.cos(lat1)
a = d[:,:,0] + cos_prod * d[:,:,1]
c = tf.asin(tf.sqrt(a)) * 6371 * 2
return c
class HaversineDistance(gpflow.kernels.Stationary):
def __init__(self, base: gpflow.kernels.Stationary):
self.base = base
@property
def active_dims(self):
return self.base.active_dims
@property
def variance(self):
return self.base.variance # for K_diag to work
def K(self, X, X2=None, presliced=False):
if not presliced:
X, X2 = self.slice(X, X2)
if X2 is None:
X2 = X
# Note that lengthscale is in Haversine distance space:
r = haversine_dist(X, X2) / self.base.lengthscale
if hasattr(self.base, "K_r"):
return self.base.K_r(r)
else:
return self.base.K_r2(tf.square(r))
where I factored out your haversine_dist
into its own function (this would make it easier to write tests for!). This works with any stationary kernel, whether SquaredExponential
(which defines K_r2
) or Matern52
(which defines K_r
). You can use it simply as follows:
k = HaversineDistance(gpflow.kernels.SquaredExponential())
or with a Matern 3/2 as the base kernel:
k = HaversineDistance(gpflow.kernels.Matern32())
Note that, as always, hyperparameter initialisation is important, and you need to remember that the lengthscale of the base kernel acts within the Haversine distance space and so needs to be appropriate for that - depending on your dataset. Either pass it in as the lengthscale
and variance
parameters when constructing the base kernel:
k = HaversineDistance(gpflow.kernels.Matern32(variance=5.0, lengthscale=0.1))
or by using assign()
on the gpflow.Parameter attributes of the base kernel:
k.base.variance.assign(5.0)
k.base.lengthscale.assign(0.1)