Search code examples
rtraminer

TraMineR - standard deviation and CI for mean plot


I was wondering if there is a function in TraMineR for computing the standard deviation. I am quite interested further on to add confidence intervals to the seqmtplot (mean plot).

My sequence (small sample of my data)

seq_time = seqdef(dt)
seqmeant(seq_time) * 10 

In my particular case I need to multiply by 10 the seqmeant results.

First question, is there a build-in function to retreive the standart deviation ?

Second question, Is there a way to add CI to the

seqmtplot(seq_time)

enter image description here

Thanks a lot.

The sample

   dt = structure(c("e nuclear and acquaintance", "d nuclear", "d nuclear", 
"d nuclear", "e nuclear and acquaintance", "j work study sleep", 
"d nuclear", "a alone", "j work study sleep", "c child", "e nuclear and acquaintance", 
"d nuclear", "d nuclear", "d nuclear", "e nuclear and acquaintance", 
"j work study sleep", "d nuclear", "a alone", "j work study sleep", 
"c child", "e nuclear and acquaintance", "d nuclear", "d nuclear", 
"d nuclear", "e nuclear and acquaintance", "j work study sleep", 
"d nuclear", "a alone", "j work study sleep", "c child", "d nuclear", 
"d nuclear", "d nuclear", "d nuclear", "e nuclear and acquaintance", 
"j work study sleep", "d nuclear", "c child", "j work study sleep", 
"c child", "d nuclear", "d nuclear", "d nuclear", "d nuclear", 
"e nuclear and acquaintance", "j work study sleep", "d nuclear", 
"c child", "j work study sleep", "c child", "d nuclear", "d nuclear", 
"d nuclear", "d nuclear", "e nuclear and acquaintance", "a alone", 
"d nuclear", "c child", "j work study sleep", "c child", "d nuclear", 
"d nuclear", "d nuclear", "d nuclear", "e nuclear and acquaintance", 
"a alone", "d nuclear", "c child", "j work study sleep", "c child", 
"d nuclear", "d nuclear", "d nuclear", "d nuclear", "e nuclear and acquaintance", 
"b partner", "d nuclear", "c child", "j work study sleep", "c child", 
"d nuclear", "d nuclear", "d nuclear", "d nuclear", "e nuclear and acquaintance", 
 "d nuclear", "d nuclear", "c child", "j work study sleep", "c child", 
"d nuclear", "d nuclear", "e nuclear and acquaintance", "e nuclear and acquaintance", 
"e nuclear and acquaintance", "d nuclear", "d nuclear", "c child", 
"j work study sleep", "c child", "d nuclear", "d nuclear", "e nuclear and acquaintance", 
"e nuclear and acquaintance", "e nuclear and acquaintance", "d nuclear", 
 "d nuclear", "d nuclear", "j work study sleep", "c child"), .Dim = 10:11, .Dimnames = list(
c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10"), c("10:30", 
"10:40", "10:50", "11:00", "11:10", "11:20", "11:30", "11:40", 
"11:50", "12:00", "12:10")))

Solution

  • There is currently no built in function in TraMineR to get the standard deviation of the time spent in states. Here is how you can compute this standard deviation and the standard error of the mean time spent in the different states

    dt.statd <- seqistatd(dt_time)
    
    mt <- apply(dt.statd, 2, mean)
    sdt <- apply(dt.statd, 2, sd)
    se.mt <- sdt/sqrt(nrow(dt.statd))
    ## using 2*se.mt as approximate error margins
    mtime <- data.frame(mean = mt, sdev = sdt, se.mean = se.mt, ci.L=mt-2*se.mt, ci.U=mt+2*se.mt)
    
    round(mtime, 3)
    

    If you have weighted sequences, you will have to consider weighted means and standard deviation. Here is an example with the mvad data

    library(TraMineR)
    data(mvad)
    mvad.lab <- c("employment", "further education", "higher education",
                  "joblessness", "school", "training")
    mvad.shortlab <- c("EM", "FE", "HE", "JL", "SC", "TR")
    mvad.seq <- seqdef(mvad[, 17:86], states = mvad.shortlab,
                       labels = mvad.lab, weights = mvad$weight, xtstep = 6)
    
    ## state distribution in each sequence
    mvad.statd <- seqistatd(mvad.seq)
    
    library(SDMTools) ## for weighted mean and std
    mt <- apply(mvad.statd, 2, wt.mean, w = mvad$weight)
    sdt <- apply(mvad.statd, 2, wt.sd, w = mvad$weight)
    se.mt <- sdt/sqrt(sum(mvad$weight))
    mtime <- data.frame(mean = mt, sdev = sdt, se.mean = se.mt)
    round(mtime, 3)
    

    To add error bars on the plot, consider the errbar function from the Hmisc package.

    Hope this helps.

    ============

    The computation of the standard errors of the meant times and the plot of the error bars is available in TraMineR since version 1.8-11. See the help pages of functions seqmeant and plot.stslist.meant.