Search code examples
rjagstweedie

How do you code a Tweedie distribution in JAGS/BUGS?


I'd like to run a model on a Tweedie distributed variable using JAGS through R. I know that JAGS doesn't have a Tweedie distribution as standard, but that it is possible to specify one as a compound Gamma/Poisson. Unfortunately I can't figure out how to code it in JAGS. I wrote the below based on code scavenged from various sources to simply try and recover the mean, power and phi parameters from a Tweedie random variable. It doesn't run at the moment because of invalid parent values on y, presumably because y[i] appears on the right hand side and left hand side of an expression. This is as it was written in the source code, but I'm evidently misusing it. Any pointers on how to properly specify this distribution would be much appreciated and probably of wider use as I haven't been able to find any simple coded on examples on how to set up Tweedie models in JAGS.

y = mgcv::rTweedie(mu=rep(2,100),p=1.33,phi=1)

jags_data = list(y=y,n=length(y))

jags_model = 
 "model{
    
    for (i in 1:n) {
      lambda[i] <- pow(mu,2-p)/(phi *(2-p))
      num[i] ~ dpois(lambda[i])
      shape[i,1] <- num[i]*((2-p)/(p-1))
      rate[i,1] <- 1/(phi*(p-1)*pow(mu,p-1))
      shape[i,2] <- 1
      rate[i,2] <- exp(-lambda[i])
      # Takes shape/rate parameter 1 if y > 0 and 2 if y = 0
      y[i] ~ dgamma(shape[i,1+equals(y[i],0)],rate[i,1+equals(y[i],0)]) 
    }
    
    mu    ~ dunif(0,100)
    p     ~ dunif(1,2)  ## Tweedie power parameter
    phi   ~ dunif(0,30) ## Dispersion parameter
    
  }
  
  "

model_file = tempfile(fileext = 'txt')
writeLines(jags_model,model_file)

jm = rjags::jags.model(
  file = model_file,
  data = jags_data,
  n.chains = 3,
  n.adapt = 1500
)


Solution

  • I was able to get this to function. Whether it "worked" or not seems like more of an open question, but the posterior means of the parameters are pretty close to the true values. I did this in R with runjags, but all of the pieces are here to be done in JAGS independent of R.

    First, I generated the data and also put the shape matrix in the data with a column of NA values so they could be overwritten in each iteration of the simulation. I also added a variable called yind which is 2 if y == 0 and 1 otherwise. This will serve as the value that will index the shape and rate matrices. It is probably more efficient to do this in the data rather than have JAGS evaluate and equals() function a few times in every iteration for every value of y which are fixed at the beginning.

    y = mgcv::rTweedie(mu=rep(2,100),p=1.33,phi=1)
    shape_mat <- matrix(NA, nrow=length(y), ncol=2)
    shape_mat[,2] <- 1
    jags_data = list(y=y,n=length(y), yind = 2-(y > 0), shape = shape_mat)
    

    Next, the model is the same save the change in how the shape matrix is dealt with and the addition of yind.

    jags_model = 
      "model{
        
        for (i in 1:n) {
          lambda[i] <- pow(mu,2-p)/(phi *(2-p))
          num[i] ~ dpois(lambda[i])
          shape[i,1] <- num[i]*((2-p)/(p-1))
          rate[i,1] <- 1/(phi*(p-1)*pow(mu,p-1))
          ## moved to data
          # shape[i,2] <- 1
          rate[i,2] <- exp(-lambda[i])
          # Takes shape/rate parameter 1 if y > 0 and 2 if y = 0
          y[i] ~ dgamma(shape[i,yind[i]],rate[i,yind[i]]) 
        }
        
        mu    ~ dunif(0,100)
        p     ~ dunif(1,2)  ## Tweedie power parameter
        phi   ~ dunif(0,30) ## Dispersion parameter
        
      }
      "
    

    The invalid parent values likely came from the initial values. The num poisson draws have to be consistent with lambda, which is a function of mu, p and phi. So, the initial values are important to ensure that these are all consistent. I set the three model parameters to the values below and then calculated lambda. I set num to the nearest integer value to lambda based on the values of these three parameters.

    inits <- list(mu = 5, p=1.5, phi=5,  
                  num = rep(1, length(y)))
    

    Next, I ran the model and summarized:

    library(runjags)
    out <- run.jags(model=jags_model, data = jags_data, monitor=c("mu", "p", "phi"), burnin=5000, sample=10000, 
                    inits = list(inits, inits), n.chains = 2, keep.jags.files=TRUE)
    summary(out)
    # Lower95   Median Upper95     Mean         SD Mode       MCerr MC%ofSD SSeff        AC.10     psrf
    # mu  1.606100 1.921855 2.24793 1.927986 0.16489395   NA 0.001534324     0.9 11550 -0.006514202 1.000242
    # p   1.252230 1.377415 1.50807 1.379773 0.06563517   NA 0.002049511     3.1  1026  0.354243967 1.013941
    # phi 0.871354 1.098075 1.36870 1.109978 0.12874006   NA 0.003880090     3.0  1101  0.322202621 1.004731