r, traminer, sequence-analysis

Convert long data.frame to sequence in TraMineR


I have a data.frame in long format that I want to convert to a TraMineR sequence object.

set.seed(1)
df <- data.frame(year = rep(1990:2010, 3),
                 id = rep(1:3, each = 21),
                 value = sample(10, 63, replace = TRUE))

AFAIK, none of the input formats described in the manual covers this layout.

What would be an easy way to convert this data.frame to a sequence object, with id as the individual, year as the time, and value as the state? One can first convert to wide format (see answer), but I wonder whether the long format is natively supported by TraMineR.


Solution

  • TraMineR does prefer data in wide format, but it can also handle other formats through its seqformat function. You can reshape the data either by calling seqformat explicitly (as recommended in the manual) or within the seqdef call. Data stored in long format can be treated as a special type of SPELL data in which each spell has length 1, so the begin and end of every spell both point to the year column.

    library(TraMineR)

    # Option 1: reshape explicitly with seqformat(), then pass the result to seqdef()
    seqformat(df,
              from = "SPELL", to = "STS",
              id = "id", 
              begin = "year", end = "year", 
              status = "value",
              process = FALSE, limit = 21) |> 
      seqdef()
    
    # Option 2: let seqdef() do the conversion itself via informat = "SPELL"
    seqdef(df,
           var = c("id", "year", "year", "value"), # start and end date = year
           informat = "SPELL",
           process = FALSE)
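
For comparison, the wide-format route mentioned in the question could look roughly like the sketch below. It is not part of the original answer: it assumes base R's reshape() for the long-to-wide step, the names wide and seq_wide are made up for illustration, and the seqdef call is guarded so the reshape part still runs when TraMineR is not installed.

```r
# Same toy data as in the question
set.seed(1)
df <- data.frame(year = rep(1990:2010, 3),
                 id = rep(1:3, each = 21),
                 value = sample(10, 63, replace = TRUE))

# Long -> wide with base R: one row per id, one state column per year
# (columns become id, value.1990, ..., value.2010)
wide <- reshape(df, idvar = "id", timevar = "year",
                v.names = "value", direction = "wide")

# Build the sequence object from the state columns (everything except id)
if (requireNamespace("TraMineR", quietly = TRUE)) {
  seq_wide <- TraMineR::seqdef(wide,
                               var = 2:ncol(wide),   # the 21 yearly state columns
                               id = wide$id,         # keep the individual ids as row names
                               cnames = 1990:2010)   # label positions by year
}
```

Both approaches should describe the same three sequences of length 21; the SPELL route simply skips the intermediate wide data.frame.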