Search code examples
pythongpflow

GPflow multi-output change-point


I would like to construct a multi-output GP, whereby the correlation structure between outputs contains a changepoint. The change should only occur in the correlation structure of the Coregion kernel, whereas the kernels themselves (i.e., lengthscale and family of kernel) should remain the same before and after the change.

Below, I include examples (from the GPflow documentation [1., 2.], and my own [3.]) which:

  1. have correlation structure between outputs, but no changepoints,
  2. demonstrates how to construct changepoints in GPflow,
  3. my attempt at a correlation structure between outputs which contains a change point.
    X1 = np.random.rand(100, 1)  # Observed locations for first output
    X2 = np.random.rand(50, 1) * 0.5  # Observed locations for second output
    

    Y1 = np.sin(6 * X1) + np.random.randn(*X1.shape) * 0.03
    Y2 = np.sin(6 * X2 + 0.7) + np.random.randn(*X2.shape) * 0.1


    # Augment the input with ones or zeros to indicate the required output dimension
    X_augmented = np.vstack((np.hstack((X1, np.zeros_like(X1))),
                             np.hstack((X2, np.ones_like(X2)))))
                             

    # Augment the Y data with ones or zeros that specify a likelihood from the list of likelihoods
    Y_augmented = np.vstack((np.hstack((Y1, np.zeros_like(Y1))),
                             np.hstack((Y2, np.ones_like(Y2)))))

    output_dim = 2  # Number of outputs
    rank = 1  # Rank of W

    # Base kernel
    k = gpflow.kernels.Matern32(active_dims=[0])

    # Coregion kernel
    coreg = gpflow.kernels.Coregion(output_dim=output_dim, rank=rank, active_dims=[1])

    kern = k * coreg
base_k1 = gpflow.kernels.Matern32(lengthscales=0.2)
base_k2 = gpflow.kernels.Matern32(lengthscales=2.0)
k = gpflow.kernels.ChangePoints([base_k1, base_k2], locations = [0.5], steepness=5.0)
output_dim = 2  # Number of outputs
rank = 1  # Rank of W


# Base kernel
k_base = gpflow.kernels.Matern32(active_dims=[0])

# Coregion kernels
coreg_1 = gpflow.kernels.Coregion(output_dim=output_dim, rank=rank, active_dims=[1])
coreg_2 = gpflow.kernels.Coregion(output_dim=output_dim, rank=rank, active_dims=[1])

k_1 = k_base * coreg_1
k_2 = k_base * coreg_2

k = gpflow.kernels.ChangePoints([k_1, k_2], [0.5], steepness=50.0)
gpflow.set_trainable(k.locations, False); gpflow.set_trainable(k.steepness, False)

When I try to fit this, using the following code:

lik = gpflow.likelihoods.SwitchedLikelihood(
    [gpflow.likelihoods.Gaussian(), gpflow.likelihoods.Gaussian()]
)

# now build the GP model as normal
m_change = gpflow.models.VGP((X_augmented, Y_augmented), kernel=k, likelihood=lik)

# fit the covariance function parameters
maxiter = ci_niter(10000)
gpflow.optimizers.Scipy().minimize(
    m_change.training_loss, m_change.trainable_variables, options=dict(maxiter=maxiter), method="L-BFGS-B",
)

I get the error "Dimensions must be equal", and I can't seem to do anything to rectify this.

My questions are:

  1. How can I get this model to fit at all?
  2. In it's current state, I think this set up would fit a different lengthscale before and after the changepoint, whereas I only wish for the correlation structure in the outputs to change. How can I set the lengthscales to be the same single trainable parameter?

please note: this is my first question here. I have attempted to stick to the guidelines, but please prompt if there are changes to my question which would make it more suitable/answerable.


Solution

  • Unfortunately, there is currently no MultiOutput support for ChangePoint kernels in GPflow. In your case, this essentially means that the ChangePoint kernel has no idea on what dimension of your outputs to act, even though the kernels that constitute it have their active_dims parameters set.

    I have a Pull Request in the works to implement this functionality that you can find here: https://github.com/GPflow/GPflow/pull/1671

    The change proposed in that pull request simply would require you to add a switch_dim flag in your call to the ChangePoint kernel, like so:

    k = gpflow.kernels.ChangePoints([k_1, k_2], locations=[0.5], steepness=50.0,
                                    switch_dim=1) # <-- This one!
    

    If you would like to try that functionality, you can install GPflow with the proposed change, for example using pip like so:

    pip install git+https://github.com/GPflow/GPflow.git@refs/pull/1671/head
    

    Alternatively, you could locate the gpflow/kernels/changepoints.py in your GPflow sources and implement the changes you find in the pull request manually.

    If you decide to do so, please be aware that this proposed change has not been extensively tested, and is not a supported feature of GPflow (yet).


    Regarding you second question, the way you have the model set up currently would fit only one Matern kernel before and after the ChangePoint, since you are using the same k_base instance for both k_1 and k_2. This means you are already fitting the same lengthscale on both sides of the CP, which seems to be the setup you are looking for.