multinomial-hmm using R

I have the following kind of data:

df <- data.frame(id = rep(1, each = 40), 
                        year = seq(1961,2000),
                        x1 = rbinom(40, size = 1, prob = 1 - 0.6) * rpois(40, lambda = 4),
                        X2 = rbinom(40, size = 1, prob = 1 - 0.7) * rpois(40, lambda = 4),
                        X3 = rbinom(40, size = 1, prob = 1 - 0.6) * rpois(40, lambda = 5),
                        X4 = rbinom(40, size = 1, prob = 1 - 0.7) * rpois(40, lambda = 6))

As you can see in my data there are four count variables. I want to estimate a multinomial-HMM, as I expect that there is a latent variable, C, with 3 possible states that affect Pr(X_t=xt) (each vector X_t is assumed to be mutually independent conditional on the Markov chain C_t). For example, I expect that if C_t=1 we would observe a vector of X_t more like this (4,1,0,0), if C_T=2 a vector of X_t more like this (0,1,1,0) and if C=3 it is more likely to observe a vector of X_t like this (0,0,1,5).

I haven't found a package able to estimate this type of model, so currently, I am using the depmixS4 package.

library(depmixS4)
mod<-depmix(list(X1 ~ 1, X2 ~ 1, X3 ~ 1, X4 ~ 1), data=df, nstates=3,
                  family=list(poisson(), poisson(), poisson(),poison()))

However, I am not sure this would be a correct model according to my theoretical expectations. Can I use depmix differently to be more suitable to my model?

Solution

You could simply use a multinomial distribution for the response. I'm assuming that you mean to let X1, ..., X4 refer to four levels of a single categorical variable? And each of these variables then contains the count of a level in a particular year? One option is then to fit the following model:

> library(depmixS4)
Loading required package: nnet
Loading required package: MASS
Loading required package: Rsolnp
Loading required package: nlme
> set.seed(1)
> df <- data.frame(id = rep(1, each = 40), 
+                  year = seq(1961,2000),
+                  X1 = rbinom(40, size = 1, prob = 1 - 0.6) * rpois(40, lambda = 4),
+                  X2 = rbinom(40, size = 1, prob = 1 - 0.7) * rpois(40, lambda = 4),
+                  X3 = rbinom(40, size = 1, prob = 1 - 0.6) * rpois(40, lambda = 5),
+                  X4 = rbinom(40, size = 1, prob = 1 - 0.7) * rpois(40, lambda = 6))
> # matrix for single multinomial response variable
> X <- as.matrix(df[,c("X1", "X2", "X3", "X4")])
> 
> # formulate model
> mod<-depmix(X ~ 1, data=df, nstates=3,
+             family=multinomial("identity"))
> 
> # fit model
> fmod <- fit(mod)
converged at iteration 22 with logLik: -161.5714 
> 
> # show results
> summary(fmod)
Initial state probabilities model 
pr1 pr2 pr3 
  0   0   1 

Transition matrix 
        toS1  toS2  toS3
fromS1 0.290 0.355 0.356
fromS2 0.132 0.328 0.540
fromS3 0.542 0.191 0.267

Response parameters 
Resp 1 : multinomial 
    Re1.pr1 Re1.pr2 Re1.pr3 Re1.pr4
St1   0.092    0.56   0.288   0.061
St2   0.033    0.00   0.423   0.544
St3   0.608    0.00   0.392   0.000