Tags: r, dataframe, dplyr, cbind

Delete the same value in each column of the dataframe and specify the positions of the values in the given columns


FIRST: I need a hint on the fastest way to do this, because I want to apply it many times to a data frame with a lot of rows. I want to delete the same value from each column of the data frame. Each column is a permutation of the same vector, i.e. a sample without replacement.

For example, I remove the value 1 from each column:

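column <- 1:20
# 20 x 5 matrix; each column is an independent random permutation of `column`
data <- matrix(column, length(column), 5)
data <- apply(data, 2, sample)
# Move the value 1 to the end of each column ...
for (n in seq_len(ncol(data))) {
  data[, n] <- c(data[-which(data[, n] == 1), n], 1)
}
# ... then drop the last row, which now holds only 1s
data <- data[-nrow(data), ]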

SECOND: For each value in the first column, I want to find its position (row index) in each of the other columns.

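pos <- function(data) {
  # Column 1 matched against itself (1, 2, ..., nrow(data) for a permutation)
  Position <- match(data[, 1], data[, 1])
  Position <- as.data.frame(Position)
  # For every other column, append the row where each value of column 1 occurs
  for (i in seq_len(ncol(data))[-1]) {
    Position <- cbind(Position, match(data[, 1], data[, i]))
  }
  return(Position)
}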
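
To make the intended output concrete, here is a small worked example. The 3 x 3 matrix `m` is a hypothetical toy input, not the real data, and the sketch applies the same `match` call as `pos` column by column:

```r
# Hypothetical toy matrix: each column is a permutation of 1:3
m <- cbind(c(2, 3, 1), c(1, 2, 3), c(3, 1, 2))

# For each column i, find the row in which each value of column 1 occurs
positions <- sapply(seq_len(ncol(m)), function(i) match(m[, 1], m[, i]))
print(positions)
#      [,1] [,2] [,3]
# [1,]    1    2    3
# [2,]    2    3    1
# [3,]    3    1    2
```

Row 1 reads: the value 2 (first entry of column 1) sits in row 1 of column 1, row 2 of column 2 and row 3 of column 3.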

If you have any faster suggestions please feel free to mention them below.


Solution

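  • A vectorized approach, applied to the original matrix in place of the loop (it drops every 1 and reshapes the result with one row fewer):

     structure(data[data != 1], .Dim = dim(data) - c(1, 0))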
    

    To get the positions relative to the first column, we can use:

     data1 = matrix(data[, 1], nrow(data), ncol(data))

     array(pmatch(data1, data), dim(data)) - (col(data) - 1) * nrow(data)
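
As a sanity check, the two vectorized expressions can be compared with the loop versions from the question. This is only a sketch: the seed and the names `loop_removed`, `vec_removed`, `by_column` and `vectorized` are arbitrary choices for illustration, and it assumes every column is a permutation of the same values.

```r
set.seed(42)  # arbitrary seed, only to make the check reproducible
n <- 20
data <- apply(matrix(1:n, n, 5), 2, sample)

# Removal: loop from the question vs. the vectorized expression
loop_removed <- data
for (j in seq_len(ncol(data))) {
  loop_removed[, j] <- c(data[data[, j] != 1, j], 1)
}
loop_removed <- loop_removed[-n, ]
vec_removed <- structure(data[data != 1], .Dim = dim(data) - c(1, 0))
same_removed <- all(loop_removed == vec_removed)

# Positions: column-by-column match() vs. the pmatch() expression
d <- vec_removed
by_column <- sapply(seq_len(ncol(d)), function(i) match(d[, 1], d[, i]))
d1 <- matrix(d[, 1], nrow(d), ncol(d))
vectorized <- array(pmatch(d1, d), dim(d)) - (col(d) - 1) * nrow(d)
same_positions <- all(by_column == vectorized)

print(c(same_removed, same_positions))
```

Both flags should come out `TRUE`. Whether the vectorized forms are actually faster on large inputs is best measured on the real data, for example with `system.time`.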