Tags: python, bayesian, pymc3

Simulate a prior from a linear model using PyMC3 without specifying a distribution for the output


I have a linear model of a system for which I don't currently have any data. It could be of the form Y = a + b + c. I would like to simulate likely values of Y based on assumptions about the distributions of the input parameters: for example, a distributed as Normal(mu=-1, sd=0.1), b as Normal(mu=15.0, sd=2.0), and c a constant. I have tried to implement such a simulation using PyMC3 as follows:

import arviz as az
import matplotlib.pyplot as plt
import pymc3 as pm

c = 2.0

with pm.Model() as my_model:
    a = pm.Normal("a", -1, 0.1)
    b = pm.Normal("b", 15.0, 2.0)
    
    Y_mu = a + b + c
    Y_sigma = pm.Normal("Y_sigma", mu=1, sd=1)
    
    Y = pm.Normal("Y", mu=Y_mu, sigma=Y_sigma)
    
    prior_checks = pm.sample_prior_predictive(samples=10000, random_seed=123)
    
fig, axes = plt.subplots(ncols=1, nrows=len(prior_checks))

for i, (key, value) in enumerate(prior_checks.items()):
    axes[i].set_title(key)
    az.plot_kde(value, ax=axes[i])

[Figure: KDE plots of the prior distributions of the sampled variables]

I have been unable to find an approach that avoids making an assumption about the distribution of Y; in particular, I cannot get values for what I define as Y_mu above. I would like to not assume a standard deviation for Y and simply see what values are generated for Y = a + b + c.

Am I approaching this the right way? Or am I missing a simple detail, and overcomplicating things?


Solution

  • You have to use pm.Deterministic to store Y_mu as a named variable, so that its values show up in the prior predictive samples:

    c = 2.0
    
    with pm.Model() as my_model:
        a = pm.Normal("a", -1, 0.1)
        b = pm.Normal("b", 15.0, 2.0)
        
        # Wrapping the sum in pm.Deterministic gives the quantity a name,
        # so its draws are returned by sample_prior_predictive
        Y_mu = pm.Deterministic("Y_mu", a + b + c)
        Y_sigma = pm.Normal("Y_sigma", mu=1, sd=1)
        
        Y = pm.Normal("Y", mu=Y_mu, sigma=Y_sigma)
        
        prior_checks = pm.sample_prior_predictive(samples=10000, random_seed=123)
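
  • With Y_mu stored as a Deterministic, prior_checks["Y_mu"] holds the 10,000 prior draws and can be plotted like any other variable. If no observed data will ever be attached, the Y_sigma and Y nodes can be dropped altogether. Below is a minimal sketch of that stripped-down model; it assumes sample_prior_predictive includes Deterministic variables in its output (the case in recent PyMC3 3.x releases):

    import arviz as az
    import pymc3 as pm
    
    c = 2.0
    
    with pm.Model():
        a = pm.Normal("a", -1, 0.1)
        b = pm.Normal("b", 15.0, 2.0)
        # No distribution is assumed for Y itself: its prior samples are
        # simply a + b + c evaluated on the prior draws of a and b
        Y = pm.Deterministic("Y", a + b + c)
        
        prior_checks = pm.sample_prior_predictive(samples=10000, random_seed=123)
    
    # prior_checks is a dict of NumPy arrays keyed by variable name
    az.plot_kde(prior_checks["Y"])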