
Replacing NA values within sequences for a state code


The dataset I work with is already in long format. It records the working states of young adults, and the alphabet consists of two states: part-time or full-time contract. All NA values should be treated as a further state: unemployed. From the TraMineR user's guide and the seqdef() help page, it seems this can be done directly when creating the STS object with seqdef(), as the documentation briefly explains:

left:
the behavior for missing values appearing before the first (leftmost) valid state in each sequence. See Gabadinho et al. (2010) for more details on the options for handling missing values when defining sequence objects. By default, left missing values are treated as 'real' missing values and converted to the internal missing value code defined by the nr option. Other options are "DEL" to delete the positions containing missing values or a state code (belonging to the alphabet or not) to replace the missing values.

I have tried replacing the * and % values with a new state code, without success: the new code is still treated as missing in practice (for instance, when plotting sequences). The left, right and gaps arguments do not seem to be the answer either.

Could somebody give a hint on how to specify the state code so that NA values are actually treated as a state included in the alphabet? Thanks a lot in advance!


Solution

  • Here is an example where left, gaps and right NAs are replaced by a new state ne (not in education). Note how we add the element ne to the alphabet.

    library(TraMineR)
    # States observed in the raw data, plus a new code "ne" for the NAs
    lab <- seqstatl(eduSTS.age)
    alphabet <- c(lab, "ne")
    long.lab <- c(lab, "not in education")
    short.lab <- c("AP", "CS", "EV", "MA", "HS", "OT", "TV", "HV", "ne")
    # left, gaps and right NAs are all recoded to "ne"
    edu.seq <- seqdef(eduSTS.age, informat = "STS", alphabet = alphabet,
           states = short.lab, labels = long.lab, missing = NA, left = "ne",
           gaps = "ne", right = "ne")
    

    As the example above shows, the string passed as the left, gaps, or right argument should be one of the states (short labels). If it is not an existing state, you have to add it to the states, and you also need to add a corresponding element to the alphabet and, if you use them, to the long labels.

    Hope this helps.
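
For the employment case in the question, the same idea can be sketched on a small made-up data set. This is only an illustration: it assumes TraMineR is installed, and the data frame `work` and the codes "PT", "FT" and "UN" are hypothetical names, not taken from the original data. It also uses wide (STS) input for brevity, so long-format data would need to be reshaped first.

```r
library(TraMineR)

# Hypothetical toy data: one row per person, one column per time point.
# "PT" = part-time, "FT" = full-time, NA = to be treated as unemployed.
work <- data.frame(
  t1 = c(NA,   "PT", "FT"),   # person 1 starts with a left NA
  t2 = c("PT", NA,   "FT"),   # person 2 has a gap
  t3 = c("FT", "PT", NA),     # person 3 ends with a right NA
  stringsAsFactors = FALSE
)

# "UN" is added to the alphabet and gets its own long label;
# states is left out here, so the short labels equal the alphabet.
work.seq <- seqdef(work, informat = "STS",
                   alphabet = c("FT", "PT", "UN"),
                   labels = c("full-time", "part-time", "unemployed"),
                   left = "UN", gaps = "UN", right = "UN")

# Inspect the resulting alphabet and the recoded sequences
alphabet(work.seq)
print(work.seq)
```

If "UN" then shows up in the alphabet and in the printed sequences, plots such as seqdplot(work.seq) should draw it as a regular state rather than as missing.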