Tags: r, matrix, sparse-matrix

R: Generating a sparse matrix with exactly one value per row (dummy coding)


I am trying to generate a sparse matrix in R to represent some dummy-coded variables. Thus, the matrix should have exactly one '1' per row (all other values being zero). So, something like this:

0 0 1 0
1 0 0 0
0 1 0 0
0 0 0 1

Is there a reasonable way to generate such a matrix? The best thing I can come up with is to create j vectors, one for each possible row, and then sample from those, but that seems a little kludgy. Any better suggestions?

Edit: Here is what I ultimately did; I did end up sampling from a list of vectors. The solutions below are probably better, especially for scaling.

matrix(unlist(sample(list(c(1, 0, 0, 0), c(0, 1, 0, 0), c(0, 0, 1, 0), c(0, 0, 0, 1)), 
                       size=93, replace=TRUE)), 93, 4, byrow=TRUE)
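For what it's worth, the list of unit vectors above is just the rows of an identity matrix, so the same idea can be written without typing each vector out. A minimal sketch, assuming `n` rows and `k` columns (both names are made up here, with values matching the call above):

```r
# Sketch: sample rows (with replacement) from a k x k identity matrix.
n <- 93  # number of rows, as in the call above
k <- 4   # number of dummy columns, as in the call above

set.seed(42)  # arbitrary seed, only for reproducibility
M <- diag(k)[sample(k, size = n, replace = TRUE), , drop = FALSE]

dim(M)             # n rows, k columns
table(rowSums(M))  # every row sums to 1
```

This still builds a dense matrix, so it only tidies up the sampling step.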

Solution

  • If you wanted to create a random dummy matrix, a quick way would be to create a function like this:

    Dummy <- function(nrow, ncol) {
      M <- matrix(0L, nrow = nrow, ncol = ncol)  # all-zero integer matrix
      M[cbind(seq_len(nrow), sample(ncol, nrow, TRUE))] <- 1L  # one random column per row
      M
    }
    

    The first line of the function creates a matrix of zeroes. The second line uses matrix indexing, with one (row, column) pair per row and a randomly sampled column, to replace exactly one value per row with a one. The third line returns the result. I'm not sure how you were planning on creating or using your j vectors, but this is how I would suggest approaching it.

    Usage is simple: you just need to specify the number of rows and the number of columns that the final matrix should have.

    Example:

    set.seed(1) ## for reproducibility
    ## (R >= 3.6.0 changed the default sampling algorithm, so your output may differ)
    Dummy(3, 3)
    #      [,1] [,2] [,3]
    # [1,]    1    0    0
    # [2,]    0    1    0
    # [3,]    0    1    0
    Dummy(6, 4)
    #      [,1] [,2] [,3] [,4]
    # [1,]    0    0    0    1
    # [2,]    1    0    0    0
    # [3,]    0    0    0    1
    # [4,]    0    0    0    1
    # [5,]    0    0    1    0
    # [6,]    0    0    1    0
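Note that `Dummy` returns an ordinary dense matrix. If you need an actual sparse object, the same (row, column) pairs used for the matrix indexing can be passed to `sparseMatrix()` from the Matrix package instead. This is only a sketch: it assumes the Matrix package is installed, and `SparseDummy` is a hypothetical name, not part of the answer above.

```r
library(Matrix)  # assumption: the Matrix package is available

# Hypothetical helper: sparse analogue of Dummy()
SparseDummy <- function(nrow, ncol) {
  sparseMatrix(
    i = seq_len(nrow),                       # one entry in every row
    j = sample(ncol, nrow, replace = TRUE),  # random column for each row
    x = 1,                                   # the stored value
    dims = c(nrow, ncol)                     # fix the shape even if a column is never drawn
  )
}

set.seed(1)
S <- SparseDummy(6, 4)
rowSums(S)  # every row sums to 1
```

Passing `dims` explicitly matters here: without it, the number of columns would be inferred from the largest sampled column index, which can be smaller than `ncol`.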