
How to create a binary matrix of inventory per row? (R)


I have a data frame with 9 columns holding an inventory of factors. Each row can have all 9 columns filled (meaning that row holds 9 "things"), but most rows hold only 3 or 4. The columns aren't specific either: if item 200 shows up in column 1 and in column 3, it's the same thing. I'd like to create a binary matrix with one row per original row and one column per item, where a 1 means the row holds that item.

Example (shortened to 4 columns just to get the point across):

r1   3   4   5   8
r2   4   6   7  NA
r3   1   5  NA  NA
r4   2   6   8   9

This should turn into:

     1  2  3  4  5  6  7  8  9 
r1   0  0  1  1  1  0  0  1  0
r2   0  0  0  1  0  1  1  0  0
r3   1  0  0  0  1  0  0  0  0
r4   0  1  0  0  0  1  0  1  1

I've looked into writeBin/readBin, K-clustering (which is something I'd like to do, but I need to get rid of the NAs first), fuzzy clustering, and tag clustering. I'm just kind of lost about which direction to go.

I've tried writing two for loops that pull the data from the matrix by row and column and then save 0s and 1s in a new matrix, but I think there were scope issues.

You guys are the best. Thanks!


Solution

  • Here's a base R solution:

    # Read in the data, and convert to matrix form
    df <- read.table(text = "
    3  4   5   8
    4  6   7   NA
    1  5  NA   NA
    2  6   8   9", header = FALSE)
    m <- as.matrix(df)
    
    # Create a two-column matrix of (row, column) indices for the cells to be
    # filled with 1s; each item value doubles as its column index
    id <- cbind(rowid = as.vector(t(row(m))), 
                colid = as.vector(t(m)))
    id <- id[complete.cases(id), ]
    
    # Create an all-zero output matrix, then set the indexed cells to 1
    out <- matrix(0, nrow = nrow(m), ncol = max(m, na.rm = TRUE))
    out[id] <- 1
    out
    #      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
    # [1,]    0    0    1    1    1    0    0    1    0
    # [2,]    0    0    0    1    0    1    1    0    0
    # [3,]    1    0    0    0    1    0    0    0    0
    # [4,]    0    1    0    0    0    1    0    1    1
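
The answer above uses each item value directly as a column index, so an inventory containing an item such as 200 (as mentioned in the question) would produce 200 columns, most of them empty. A minimal sketch of one possible variation, assuming you would rather have one column per item that actually occurs: look up each value's position among the sorted unique items with `match()`. The function name `inventory_to_binary` and the sample data are hypothetical and not part of the original answer.

```r
# Hypothetical helper (not from the original answer): builds one column per
# distinct item, so sparse IDs such as 200 do not create empty columns.
inventory_to_binary <- function(m) {
  m <- as.matrix(m)
  items <- sort(unique(as.vector(m)))        # sort() drops NA by default
  out <- matrix(0L, nrow = nrow(m), ncol = length(items),
                dimnames = list(rownames(m), as.character(items)))
  # Pair every cell's row index with the column position of its item
  id <- cbind(as.vector(row(m)), match(as.vector(m), items))
  id <- id[!is.na(id[, 2]), , drop = FALSE]  # skip the NA cells
  out[id] <- 1L
  out
}

# Made-up sample data with one large item ID
m <- rbind(r1 = c(3, 200, 5, 8),
           r2 = c(4, 6, 200, NA),
           r3 = c(1, 5, NA, NA))
res <- inventory_to_binary(m)
res
```

The column names of `res` are the item IDs themselves, so the mapping back to the inventory is preserved. The result contains no NAs, which was the stated obstacle to clustering in the question.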