r dummy-variable

Creating a dummy matrix from a concatenated column


I'm using R and I have a column that looks like this:

relative
aunt
mother,grandmother

sister,mother

My desired outcome should look like this:

mother  sister aunt grandmother
0       0      1    0
1       0      0    1
0       0      0    0
1       1      0    0

How can I do that? Thanks in advance.


Solution

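  • Split each entry on commas, collect the unique labels, then test each row for membership:

    # Example data; "" stands for an empty row
    relative <- c("aunt", "mother,grandmother", "sister,mother", "", "other")
    R <- strsplit(relative, ',')   # one character vector of labels per row
    r <- unique(unlist(R))         # distinct labels, in order of first appearance
    # One logical row per entry: TRUE where the label occurs in that entry
    result <- t(sapply(R, function(Ri) if (length(Ri)==0) rep(FALSE, length(r)) else r %in% Ri))
    colnames(result) <- r
    result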
    # > result
    #       aunt mother grandmother sister other
    # [1,]  TRUE  FALSE       FALSE  FALSE FALSE
    # [2,] FALSE   TRUE        TRUE  FALSE FALSE
    # [3,] FALSE   TRUE       FALSE   TRUE FALSE
    # [4,] FALSE  FALSE       FALSE  FALSE FALSE
    # [5,] FALSE  FALSE       FALSE  FALSE  TRUE
    

    For a 0/1 integer matrix instead of logicals, apply unary plus, which coerces TRUE/FALSE to 1/0:

    +result
    # > +result
    #      aunt mother grandmother sister other
    # [1,]    1      0           0      0     0
    # [2,]    0      1           1      0     0
    # [3,]    0      1           0      1     0
    # [4,]    0      0           0      0     0
    # [5,]    0      0           0      0     1
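
The same split-and-match idea can be wrapped in a small function, shown here as a sketch rather than a definitive implementation. The name `dummy_matrix` and its `levels` and `sep` arguments are hypothetical, not part of the answer above or of any package. Passing `levels` fixes the column order, so the result can match the layout asked for in the question. The sketch assumes the column holds no `NA` entries.

```r
# Hypothetical helper (not from the original answer): build a 0/1 indicator
# matrix from a character vector of separator-delimited labels.
dummy_matrix <- function(x, levels = NULL, sep = ",") {
  # Split each entry into its labels; "" yields character(0)
  parts <- strsplit(x, sep, fixed = TRUE)
  # Tolerate stray spaces such as "sister, mother"
  parts <- lapply(parts, trimws)
  # Default column order: labels in order of first appearance
  if (is.null(levels)) levels <- unique(unlist(parts))

  # Preallocate so the result is always a matrix, even with a single label
  out <- matrix(0L, nrow = length(x), ncol = length(levels),
                dimnames = list(NULL, levels))
  for (i in seq_along(parts)) {
    out[i, ] <- as.integer(levels %in% parts[[i]])
  }
  out
}

# The column from the question, with "" for the empty row
relative <- c("aunt", "mother,grandmother", "", "sister,mother")
m <- dummy_matrix(relative,
                  levels = c("mother", "sister", "aunt", "grandmother"))
print(m)
```

Preallocating the matrix and filling it row by row avoids the case where `sapply` simplifies to a plain vector when there is only one distinct label.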