I would like to make sequence index plots of a multi-channel sequence object for a first descriptive look at the data. However, I am still not sure how to do that properly. The usual way of sorting a single sequence object does not work well, because there is no nested sorting function that sorts across multiple channels. The best way, it seems to me, is to compute multi-channel sequence distances with seqdistmc, run an MDS on them, and sort all channels by the resulting scores. This approach requires several decisions (the distance measure and so on), so it almost goes beyond what I intended as a first description.
seqHMM, see below.
Here is some R syntax which may help to understand my problem:
library(TraMineR)
library(TraMineRextras)
# Building sequence objects
data(biofam)
## Building one channel per type of event left, children or married
bf <- as.matrix(biofam[, 10:25])
children <- bf==4 | bf==5 | bf==6
married <- bf == 2 | bf== 3 | bf==6
left <- bf==1 | bf==3 | bf==5 | bf==6
child.seq <- seqdef(children)
marr.seq <- seqdef(married)
left.seq <- seqdef(left)
# Create unsorted Sequence Index Plots
layout(matrix(c(1,2,3,4,4,4), 2, 3, byrow=TRUE), heights=c(3,1.5))
seqIplot(child.seq, title="Children", withlegend=FALSE)
seqIplot(marr.seq, title="Married", withlegend=FALSE)
seqIplot(left.seq, title="Left parents", withlegend=FALSE)
seqlegend(child.seq, horiz=TRUE, position="top", xpd=TRUE)
# Create sequence Index Plots sorted by alignment of first channel from beginning
mcsort.ch1 <- sortv(child.seq, start="beg")
layout(matrix(c(1,2,3,4,4,4), 2, 3, byrow=TRUE), heights=c(3,1.5))
seqIplot(child.seq, title="Children", sortv=mcsort.ch1, withlegend=FALSE)
seqIplot(marr.seq, title="Married", sortv=mcsort.ch1, withlegend=FALSE)
seqIplot(left.seq, title="Left parents", sortv=mcsort.ch1, withlegend=FALSE)
seqlegend(child.seq, horiz=TRUE, position="top", xpd=TRUE)
# Sequence Index Plots sorted by MDS scores of multi-channel distances
## Calculate multi-channel distances and MDS scores
mcdist <- seqdistmc(channels=list(child.seq, marr.seq, left.seq),
                    method="OM", sm=list("TRATE", "TRATE", "TRATE"))
mcsort.mds <- cmdscale(mcdist, k=2, eig=TRUE)
## Create sequence Index Plots sorted by MDS scores of multi-channel distances
layout(matrix(c(1,2,3,4,4,4), 2, 3, byrow=TRUE), heights=c(3,1.5))
seqIplot(child.seq, title="Children", sortv=mcsort.mds$points[,1], withlegend=FALSE)
seqIplot(marr.seq, title="Married", sortv=mcsort.mds$points[,1], withlegend=FALSE)
seqIplot(left.seq, title="Left parents", sortv=mcsort.mds$points[,1], withlegend=FALSE)
seqlegend(child.seq, horiz=TRUE, position="top", xpd=TRUE)
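The nested sort across channels that I was missing can be approximated in base R. The following is only a minimal sketch: nested_sort_rank is a hypothetical helper (not part of TraMineR), and the toy matrices stand in for real channels. It sorts by the first channel from the beginning and breaks ties with the second channel, and so on.

```r
# Hypothetical helper (not a TraMineR function): rank cases by a nested sort
# over several channels, channel 1 first, ties broken by channel 2, etc.
nested_sort_rank <- function(channels) {
  # Each channel becomes a matrix with rows = cases, columns = time points
  mats <- lapply(channels, as.matrix)
  # All channels must describe the same cases
  stopifnot(length(unique(vapply(mats, nrow, integer(1)))) == 1)
  # One sort key per time point: all columns of channel 1, then channel 2, ...
  keys <- unlist(lapply(mats, function(m) {
    lapply(seq_len(ncol(m)), function(j) m[, j])
  }), recursive = FALSE)
  ord <- do.call(order, keys)
  # Turn the ordering into ranks, so that order(rank) reproduces ord
  rank <- integer(length(ord))
  rank[ord] <- seq_along(ord)
  rank
}

# Toy data: 3 cases, 3 time points, 2 channels
ch1 <- matrix(c("a", "a", "b",
                "a", "a", "a",
                "a", "a", "b"), nrow = 3, byrow = TRUE)
ch2 <- matrix(c("y", "y", "y",
                "x", "x", "x",
                "x", "y", "y"), nrow = 3, byrow = TRUE)
toy_rank <- nested_sort_rank(list(ch1, ch2))
```

With the three channels built above, the same rank could be passed to every seqIplot call, e.g. sortv = nested_sort_rank(list(child.seq, marr.seq, left.seq)). This assumes that as.matrix() on a state sequence object returns the states as a plain matrix, which I have not verified for all TraMineR versions.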
I just stumbled upon the package seqHMM, whose name suggests other purposes but which is able to sort multi-channel sequence objects. Thus, seqHMM is an answer to my first question.
Here is some example code using seqHMM:
library(TraMineR)
library(seqHMM)
# Building sequence objects
data(biofam)
## Building one channel per type of event left, children or married
bf <- as.matrix(biofam[, 10:25])
children <- bf==4 | bf==5 | bf==6
married <- bf == 2 | bf== 3 | bf==6
left <- bf==1 | bf==3 | bf==5 | bf==6
child.seq <- seqdef(children)
marr.seq <- seqdef(married)
left.seq <- seqdef(left)
mcplot <- ssp(list(child.seq, marr.seq, left.seq),
              type = "I", title = "Sequence index plots",
              sortv = "from.start", sort.channel = 1,
              withlegend = FALSE, ylab.pos = c(1, 1.5, 1),
              ylab = c("Parenthood", "Marriage", "Residence"))
plot(mcplot)
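The MDS-based sorting from my question also seems to be available directly in ssp. As far as I understand the seqHMM documentation, sortv = "mds.obs" sorts the sequences by multidimensional scaling scores computed from the observed channels, so the distance and cmdscale steps do not have to be done by hand. A minimal sketch under that assumption (the object name mcplot.mds is just for illustration):

```r
library(TraMineR)
library(seqHMM)

# Same three channels as above, built from the biofam data
data(biofam)
bf <- as.matrix(biofam[, 10:25])
child.seq <- seqdef(bf == 4 | bf == 5 | bf == 6)
marr.seq <- seqdef(bf == 2 | bf == 3 | bf == 6)
left.seq <- seqdef(bf == 1 | bf == 3 | bf == 5 | bf == 6)

# Sort all channels by MDS scores of the observed multi-channel sequences
# (assumption: "mds.obs" behaves as described in the ssp help page)
mcplot.mds <- ssp(list(child.seq, marr.seq, left.seq),
                  type = "I", sortv = "mds.obs",
                  ylab = c("Parenthood", "Marriage", "Residence"))
plot(mcplot.mds)
```

The help page also describes a dist.method argument for choosing the dissimilarity measure behind the MDS, so the decision about the distance measure is still there, it just gets a default.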
You can find more about the abilities of the package in the following paper:
Helske, Satu; Helske, Jouni (2016): Mixture Hidden Markov Models for Sequence Data: the seqHMM Package in R. Jyväskylä. Available online at https://cran.r-project.org/web/packages/seqHMM/vignettes/seqHMM.pdf.