What is a chain in PyMC3?

I am learning PyMC3 for Bayesian modeling. You can create a model and sample with:

import pandas as pd
import pymc3 as pm

# obs is a DataFrame with a single column, containing
# the observed values for variable height
obs = pd.DataFrame(...)

# we create a pymc3 model
with pm.Model() as m:
    mu = pm.Normal('mu', mu=178, sd=20)
    sigma = pm.Uniform('sigma', lower=0, upper=50)
    height = pm.Normal('height', mu=mu, sd=sigma, observed=obs)
    trace = pm.sample(1000, tune=1000)

pm.traceplot(trace)

When I check the trace (in this case 1000 samples from the posterior probability), I notice that 2 chains are created:

>>> trace.nchains
2

I read the tutorial on PyMC3 and looked through the API but it is unclear to me what a chain represents (in this case I asked for 1000 samples from the posterior but I got 2 chains, each one with 1000 samples from the posterior).

Are the chains different runs of the sampler with the same parameters or do they have some other meaning/purpose?

Solution

A chain is a single run of MCMC. So if you have six 2-d parameters in your model and ask for 1000 samples, you will get six 2x1000 arrays for each chain.

When running MCMC, it is a best practice to use multiple chains, as they can help diagnose problems. For example, the Gelman-Rubin diagnostic requires multiple chains, and runs automatically (using joblib, which tries to use multiple cores if possible) if you use more than 1 chain in PyMC3.

As a concrete example of when you might want multiple chains, consider sampling from a multimodal distribution. Even the NUTS sampler may not visit both modes in a single chain, but you could diagnose this using multiple chains.

Note that PyMC3 usually combines the chains when you work with them (e.g., using trace.get_values('my_var')), since they are all valid MCMC samples. This does lead to some confusing behavior in that asking for 1000 samples actually gets you 4000 on most systems, where you get 4 chains by default.