
R: Sample n elements in certain columns in a dataframe/matrix and replace their values


I am struggling to solve the problem described in the title.

My data frame looks like this:

  X1 X2 X3 X4 X5
1  1  2  3  4  5
2  6  7  8  9 10
3 11 12 13 14 15

What I am trying to do is randomly select 3 elements from the third and fourth columns and replace their values with 0. The manipulated data frame could then look like this:

  X1 X2 X3 X4 X5
1  1  2  3  4  5
2  6  7  0  0 10
3 11 12 13  0 15

I saw in the question "Random number selection from a data-frame" that it could be easier if I convert the data frame into a matrix, so I tried:

mat <- data.frame(rbind(rep(1:5, 1), rep(6:10, 1), rep(11:15, 1)))
mat_matrix <- as.matrix(mat)
mat_matrix[sample(mat_matrix[, 3:4], 3)] <- 0 

But it just randomly picked 3 elements across all columns and rows in the matrix and turned them into 0.

Can anyone help me out?


Solution

  • Nothing wrong with a for loop in this case. Perhaps like this:

    
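    # Example data from the question
    mat <- data.frame(rbind(rep(1:5, 1), rep(6:10, 1), rep(11:15, 1)))
    cols <- c(3,4)
    
    # Sample 3 cell positions out of all cells in the target columns
    n <- nrow(mat)*length(cols)
    v <- sample( x=1:n, size=3 )
    # Logical index matrix: one column per target column, TRUE = set to 0
    m <- matrix(FALSE, ncol=length(cols), nrow=nrow(mat))
    m[v] <- TRUE
    
    # Zero the flagged rows, one target column at a time
    for( i in seq_along(cols) ) {
        mat[ m[,i], cols[i] ] <- 0
    }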
    

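    The idea is to create a logical "index matrix" with one column per target column, mark three randomly sampled cells in it as TRUE, and then use each of its columns to select the rows to overwrite in your data.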
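
For what it's worth, the attempt in the question fails because sample(mat_matrix[, 3:4], 3) returns three of the values stored in columns 3 and 4 (for example 8, 13, 4), and those values are then used as positions in the whole matrix. A loop-free variant of the same idea could look like the sketch below: it samples positions inside the sub-block of target columns and writes the block back. The helper name zero_random_cells is made up for illustration and is not from the original answer, and the sketch assumes all columns are numeric so that as.matrix() does not coerce them to character.

```r
# Hypothetical helper (name chosen for illustration only).
# Sets `size` randomly chosen cells within columns `cols` of `df` to 0.
# Assumes every column of `df` is numeric.
zero_random_cells <- function(df, cols, size) {
  m <- as.matrix(df)
  # Keep only the target columns; drop = FALSE keeps a matrix even for one column
  block <- m[, cols, drop = FALSE]
  # Sample positions inside the block (1..number of cells), not values
  idx <- sample.int(length(block), size)
  block[idx] <- 0
  # Write the modified block back into the full matrix
  m[, cols] <- block
  as.data.frame(m)
}

mat <- data.frame(rbind(1:5, 6:10, 11:15))
res <- zero_random_cells(mat, cols = c(3, 4), size = 3)
print(res)
```

Because the positions are sampled without replacement, exactly three distinct cells in X3 and X4 end up as 0, while X1, X2 and X5 are left untouched.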