Mixture prior not working in JAGS, only when likelihood term included

The code at the bottom will replicate the problem, just copy and paste it into R.

What I want is for the mean and precision to be (-100, 100) 30% of the time, and (200, 1000) for 70% of the time. Think of it as lined up in a, b, and p.

So 'pick' should be 1 30% of the time, and 2 70% of the time.

What actually happens is that on every iteration, pick is 2 (or 1 if the first element of p is the larger one). You can see this in the summary, where the quantiles for 'pick', 'testa', and 'testb' remain unchanged throughout. The strangest thing is that if you remove the likelihood loop, pick then works exactly as intended.

I hope this explains the problem, if not let me know. It's my first time posting so I'm bound to have messed things up.

library(rjags)
n = 10
y <- rnorm(n, 5, 10)
a = c(-100, 200)
b = c(100, 1000)
p = c(0.3, 0.7)

## Model

mod_str = "model{
    # Likelihood
    for (i in 1:n){
            y[i] ~ dnorm(mu, 10)
    }

    # ISSUE HERE: MIXTURE PRIOR
    mu ~ dnorm(a[pick], b[pick])
    pick ~ dcat(p[1:2])

    testa = a[pick]
    testb = b[pick]
}"

model = jags.model(textConnection(mod_str), data = list(y = y, n=n, a=a, b=b, p=p), n.chains=1)
update(model, 10000)
res = coda.samples(model, variable.names = c('pick', 'testa', 'testb', 'mu'), n.iter = 10000)
summary(res)

Solution

I think you are having problems for a couple of reasons. First, the data that you have supplied to the model (i.e., y) is not a mixture of normal distributions. As a result, the model itself has no need to mix. I would instead generate data something like this:

set.seed(320)

# number of samples
n <- 10

# Because it is a mixture of 2 we can just use an indicator variable.
#  here, pick (in the long run), would be '1' 30% of the time.
pick <- rbinom(n, 1, p[1])

# generate the data. b is in terms of precision so we are converting this
#  to standard deviations (which is what R wants).
y_det <- pick * rnorm(n, a[1], sqrt(1/b[1])) + (1 - pick) * rnorm(n, a[2], sqrt(1/b[2]))

# add a small amount of noise, can change to be more as necessary.
y <- rnorm(n, y_det, 1)

These data look more like what you would want to supply to a mixture model.

Following this, I would code the model up in a similar way as I did the data generation process. I want some indicator variable to jump between the two normal distributions. Thus, mu may change for each scalar in y.

mod_str = "model{
    # Likelihood
    for (i in 1:n){
            y[i] ~ dnorm(mu[i], 10)
            mu[i] <- mu_ind[i] * a_mu + (1 - mu_ind[i]) * b_mu
            mu_ind[i] ~ dbern(p[1])

    }
    a_mu ~ dnorm(a[1], b[1])
    b_mu ~ dnorm(a[2], b[2])

}"

model = jags.model(textConnection(mod_str), data = list(y = y, n=n, a=a, b=b, p=p), n.chains=1)
update(model, 10000)
res = coda.samples(model, variable.names = c('mu_ind', 'a_mu', 'b_mu'), n.iter = 10000)
summary(res)

             2.5%    25%    50%    75% 97.5%
a_mu       -100.4 -100.3 -100.2 -100.1  -100
b_mu        199.9  200.0  200.0  200.0   200
mu_ind[1]     0.0    0.0    0.0    0.0     0
mu_ind[2]     1.0    1.0    1.0    1.0     1
mu_ind[3]     0.0    0.0    0.0    0.0     0
mu_ind[4]     1.0    1.0    1.0    1.0     1
mu_ind[5]     0.0    0.0    0.0    0.0     0
mu_ind[6]     0.0    0.0    0.0    0.0     0
mu_ind[7]     1.0    1.0    1.0    1.0     1
mu_ind[8]     0.0    0.0    0.0    0.0     0
mu_ind[9]     0.0    0.0    0.0    0.0     0
mu_ind[10]    1.0    1.0    1.0    1.0     1

If you supplied more data, you would (in the long run) have the indicator variable mu_ind take the value of 1 30% of the time. If you had more than 2 distributions you could instead use dcat. Thus, an alternative and more generalized way of doing this would be (and I am borrowing heavily from this post by John Kruschke):

mod_str = "model {
  # Likelihood:
  for( i in 1 : n ) {
    y[i] ~ dnorm( mu[i] , 10 ) 
    mu[i] <- muOfpick[ pick[i] ]
    pick[i] ~ dcat( p[1:2] )
  }
  # Prior:
  for ( i in 1:2 ) {
    muOfpick[i] ~ dnorm( a[i] , b[i] )
  }
}"
model = jags.model(textConnection(mod_str), data = list(y = y, n=n, a=a, b=b, p=p), n.chains=1)
update(model, 10000)
res = coda.samples(model, variable.names = c('pick', 'muOfpick'), n.iter = 10000)
summary(res)

              2.5%    25%    50%    75% 97.5%
muOfpick[1] -100.4 -100.3 -100.2 -100.1  -100
muOfpick[2]  199.9  200.0  200.0  200.0   200
pick[1]        2.0    2.0    2.0    2.0     2
pick[2]        1.0    1.0    1.0    1.0     1
pick[3]        2.0    2.0    2.0    2.0     2
pick[4]        1.0    1.0    1.0    1.0     1
pick[5]        2.0    2.0    2.0    2.0     2
pick[6]        2.0    2.0    2.0    2.0     2
pick[7]        1.0    1.0    1.0    1.0     1
pick[8]        2.0    2.0    2.0    2.0     2
pick[9]        2.0    2.0    2.0    2.0     2
pick[10]       1.0    1.0    1.0    1.0     1

The link above includes even more priors (e.g., a Dirichlet prior on the probabilities incorporated into the Categorical distribution).