Search code examples
pythonbayesianpymc3arviz

PyMC3 and Arviz: Visualizing highest posterior density for multiple conditions using arviz plot_hpd


I am trying to visualize simple linear regression with highest posterior density (hpd) for multiple groups. However, I have a problem to apply hpd for each condition. Whenever I ran this code, I am extracting the same posterior density for each condition. I would like to visualize posterior density that corresponds to it's condition. How can I plot hpd for each group?

EDIT: Issue has been solved in PyMC3 discourse

import pymc3 as pm
import numpy as np
import matplotlib.pyplot as plt
import arviz as az
import pandas as pd

# data

data = pd.read_csv('www_MCMC/MCMC/data.csv')
rsp = data['Mean Response'].values
rt = data['Mean Reaction Time'].values
idx = pd.Categorical(data['Structure'], categories=['No Background', 'Only Road', 'Only Dot Ground', 'Dot Terrain + Dot Ground', 'Space', 'Full Background']).codes
groups = len(np.unique(idx))

# model

with pm.Model() as rsp_rt:
        
    α = pm.Normal('α', mu=0, sd=10, shape=groups)
    β = pm.Normal('β', mu=0, sd=10, shape=groups)
    ϵ = pm.HalfCauchy('ϵ', 10)
    
    μ = pm.Deterministic('μ', α[idx] + β[idx] * rt)
    
    y_pred = pm.Normal('y_pred2', mu=μ, sd=ϵ, observed=rsp)
    
    trace_rsp_rt = pm.sample(cores=1) 
    
_, ax_rsp_rt = plt.subplots(2, 3, figsize=(10, 5), sharex=True, sharey=True, constrained_layout=True)
ax_rsp_rt = np.ravel(ax_rsp_rt)

for i in range(groups):
    
    alpha = trace_rsp_rt['α'][:, i].mean()
    beta = trace_rsp_rt['β'][:, i].mean()
    
    ax_rsp_rt[i].plot(rt, alpha + beta * rt, c='k', label= f'rsp = {alpha:.2f} + {beta:.2f} * rt')
    az.plot_hpd(rt, trace_rsp_rt['μ'], credible_interval=0.98, color='k', ax=ax_rsp_rt[i])
    ax_rsp_rt[i].set_title(f'$\mu_{i}$')
    ax_rsp_rt[i].set_xlabel(f'$x_{i}$')
    ax_rsp_rt[i].set_ylabel(f'$y_{i}$', labelpad=17, rotation=0)
    ax_rsp_rt[i].legend()
    plt.xlim(1.2, 1.8)
    plt.ylim(0.6, 1) 

linear regression with hpd


Solution

  • I have answered the question on PyMC3 discourse, please refer there for a more detailed answer.

    I am sharing part of the answer here too for completeness:

    There are a couple of small modifications to the code that should fix the problem. However, I'd recommend taking advantage of ArviZ and xarray as it is shown in this notebook.

    ...
    
    for i in range(groups):
        
        alpha = trace_rsp_rt['α'][:, i]
        beta = trace_rsp_rt['β'][:, i]
        mu = alpha + beta * rt  
        # there may be broadcasting issues requiring to use rt[None, :]
        # xarray would handle broadcasting automatically ass seen in the notebook
        
        ax_rsp_rt[i].plot(rt, mu.mean(), c='k', label= f'rsp = {alpha:.2f} + {beta:.2f} * rt')
        az.plot_hpd(rt, mu, credible_interval=0.98, color='k', ax=ax_rsp_rt[i])
        ax_rsp_rt[i].legend()
        # combining pyplot and object based commands can yield unexpected results
        ax.set_xlim(1.2, 1.8)  
        ax.set_ylim(0.6, 1)