I am trying to generate a sparse matrix in R to represent some dummy-coded variables. That is, the matrix should have exactly one '1' per row, with all other values being zero. Something like this:
0 0 1 0
1 0 0 0
0 1 0 0
0 0 0 1
Is there a reasonable way to generate such a matrix? The best thing I can come up with is to create j vectors, one for each possible row, and then sample from those, but that seems a little kludgy. Any better suggestions?
Edit: Here is what I ultimately did; I did indeed sample from a list of vectors. The solutions below are, I guess, superior, especially for scaling.
matrix(unlist(sample(list(c(1, 0, 0, 0), c(0, 1, 0, 0), c(0, 0, 1, 0), c(0, 0, 0, 1)),
                     size = 93, replace = TRUE)),
       93, 4, byrow = TRUE)
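For what it's worth, the same "sample from the possible rows" idea can probably be written without spelling out each vector, since the rows of an identity matrix are exactly the possible dummy rows. A minimal sketch, assuming `k` categories and `n` rows (both names are illustrative and not part of the original code):

```r
# Assumed, illustrative names: k = number of categories, n = number of rows.
k <- 4
n <- 93

# Each row of the k x k identity matrix is one possible dummy row;
# indexing with sampled row numbers picks one of them per output row.
m <- diag(k)[sample(k, n, replace = TRUE), , drop = FALSE]
```

This scales to any number of categories by changing `k`, without typing out the vectors by hand.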
If you wanted to create a random dummy matrix, a quick way would be to create a function like this:
Dummy <- function(nrow, ncol) {
  M <- matrix(0L, nrow = nrow, ncol = ncol)                 # all-zero integer matrix
  M[cbind(seq_len(nrow), sample(ncol, nrow, TRUE))] <- 1L   # one random column per row
  M
}
The first line of the function creates a matrix filled with zeroes. The second line uses matrix indexing (a two-column matrix of row and column positions) to replace exactly one value per row with a one; the column for each row is drawn at random, with replacement. The third line returns the result. I'm not sure how you were planning on creating or using your j vectors, but this is how I would suggest approaching it.
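To make the matrix-indexing step concrete, here is a small sketch with hand-picked (not random) column positions. The object names `M` and `idx` are just for illustration:

```r
M <- matrix(0L, nrow = 3, ncol = 4)

# Each row of idx is a (row, column) pair; the columns here are
# chosen by hand purely for illustration.
idx <- cbind(1:3, c(2L, 4L, 1L))

# Indexing with a two-column matrix assigns to exactly those cells.
M[idx] <- 1L
M
#      [,1] [,2] [,3] [,4]
# [1,]    0    1    0    0
# [2,]    0    0    0    1
# [3,]    1    0    0    0
```

Swapping the hand-picked columns for a random draw of column numbers gives one randomly placed 1 per row.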
Usage is simple: You just need to specify the number of rows and the number of columns that the final matrix should have.
Example:
set.seed(1) ## for reproducibility (R >= 3.6.0 changed the default sample() algorithm, so your draws may differ)
Dummy(3, 3)
#      [,1] [,2] [,3]
# [1,]    1    0    0
# [2,]    0    1    0
# [3,]    0    1    0
Dummy(6, 4)
#      [,1] [,2] [,3] [,4]
# [1,]    0    0    0    1
# [2,]    1    0    0    0
# [3,]    0    0    0    1
# [4,]    0    0    0    1
# [5,]    0    0    1    0
# [6,]    0    0    1    0
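Note that Dummy() returns an ordinary dense matrix. If you need a sparse storage format, one possible approach is to pass the same row and column positions to sparseMatrix() from the Matrix package. This is only a sketch: it assumes the Matrix package is available, and `SparseDummy` is a made-up name for illustration.

```r
library(Matrix)  # assumption: the Matrix package is installed

# SparseDummy is a hypothetical helper name, not from the answer above.
SparseDummy <- function(nrow, ncol) {
  sparseMatrix(i = seq_len(nrow),                        # one entry in every row
               j = sample(ncol, nrow, replace = TRUE),   # random column for each row
               x = 1,                                    # stored value
               dims = c(nrow, ncol))                     # keep columns that receive no 1
}

S <- SparseDummy(6, 4)
```

Passing `dims` explicitly matters here: without it, the matrix would only be as wide as the largest column index that happened to be sampled.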