Search code examples
pythonpymc3

PyMC3 select data within model for switchpoint analysis


I am generating a time series that has a drastic change in the middle.

import numpy as np

size = 120 
x1 = np.random.randn(size)
x2 = np.random.randn(size) * 4
x = np.hstack([x1, x2])

This series of x looks like this:

enter image description here

The goal is now to use PyMC3 to estimate the posterior distribution of the time when the change occurred (switchpoint). This should occur around the index 120. I've used the following code;

from pymc3 import Model, Normal, HalfNormal, DiscreteUniform
basic_model = Model()

with basic_model:
    mu1 = Normal('mu1', mu=0, sd=10)
    mu2 = Normal('mu2', mu=0, sd=10)
    sigma1 = HalfNormal('sigma1', sd=2)
    sigma2 = HalfNormal('sigma2', sd=2)
    tau = DiscreteUniform('tau', 0, 240)

    # get likelihoods
    y1 = Normal('y1', mu=mu1, sd=sigma1, observed=x[:tau])
    y2 = Normal('y2', mu=mu2, sd=sigma2, observed=x[tau:])

Doing this gives an error that I cannot use tau to slice the array. What would be the approach to solve this in PyMC? It seems like I'll need the slicing to be done by something stochastic in PyMC.


Solution

  • Turns out PyMC3 has a switch model. Let t be the variable for time.

    import pymc3 as pm
    basic_model = pm.Model()
    
    with basic_model:
        mu1 = pm.Normal('mu1', mu=0, sd=10)
        mu2 = pm.Normal('mu2', mu=0, sd=10)
        sigma1 = pm.HalfNormal('sigma1', sd=2)
        sigma2 = pm.HalfNormal('sigma2', sd=2)
        switchpoint = pm.DiscreteUniform('switchpoint', t.min(), t.max())
    
        tau_mu = pm.switch(t >= switchpoint, mu1, mu2)
        tau_sigma = pm.switch(t >= switchpoint, sigma1, sigma2)
    
        y = pm.Normal('y1', mu=tau_mu, sd=tau_sigma, observed=x)