
Subset or filter a data.frame by indices, e.g. one column index per row


Assume you have a data.frame like this:

df <- data.frame(matrix(1:12, 4))
df
  X1 X2 X3
1  1  5  9
2  2  6 10
3  3  7 11
4  4  8 12

which has to be filtered row-wise by these column indices, one per row:

b=c(2,1,3,2)

So the expected output should be this:

c(5, 2, 11, 8)

The following approach is obviously not the solution:

df[ 1:nrow(df), b] 
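
To make the problem with that call concrete, here is a minimal sketch using the same `df` and `b` as above (the name `wrong` is only illustrative). It selects the columns listed in `b` for every row, so the result is a whole data frame rather than one value per row:

```r
df <- data.frame(matrix(1:12, 4))
b <- c(2, 1, 3, 2)

# Columns b are selected for ALL rows, giving a 4 x 4 data frame.
wrong <- df[1:nrow(df), b]
dim(wrong)
# [1] 4 4

# The wanted values are only the diagonal of that result.
diag(as.matrix(wrong))
# [1]  5  2 11  8
```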

So far I'm using an approach with mapply which is working:

mapply(function(x, y) x[y], as.data.frame(t(df)), b, USE.NAMES = FALSE)
[1]  5  2 11  8

But I'm wondering whether there is a more elegant solution.


Solution

  • You can use numeric matrix indexing; see ?"[", section "Matrices and arrays":

    A third form of indexing is via a numeric matrix with the one column for each dimension: each row of the index matrix then selects a single element of the array, and the result is a vector. Negative indices are not allowed in the index matrix. NA and zero values are allowed: rows of an index matrix containing a zero are ignored, whereas rows containing an NA produce an NA in the result.

    The original data frame has two dimensions, so you can construct an index matrix with two columns: the first column holds the row index and the second holds the column index. Each row of that matrix then extracts one element from the data frame, as stated in the documentation:

    b=c(2,1,3,2)
    
    df[cbind(seq_len(nrow(df)), b)]
    # [1]  5  2 11  8
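
    The zero and NA rules quoted from the documentation can be checked with a short sketch. The names `idx`, `picked`, `b2`, `with_gaps` and `mixed` are only illustrative and do not come from the question:

```r
df <- data.frame(matrix(1:12, 4))
b  <- c(2, 1, 3, 2)

# One row per element to extract: column 1 = row index, column 2 = column index.
idx <- cbind(seq_len(nrow(df)), b)
picked <- df[idx]
picked
# [1]  5  2 11  8

# A zero drops that row of the index matrix; an NA yields NA in the result.
b2 <- c(2, 0, NA, 2)
with_gaps <- df[cbind(seq_len(nrow(df)), b2)]
with_gaps
# [1]  5 NA  8

# Caveat: with mixed column types the result is coerced to a single type.
mixed <- data.frame(x = 1:2, y = c("a", "b"))
class(mixed[cbind(1:2, c(1, 2))])
# [1] "character"
```

    The last lines show one thing to watch for. As far as I know, matrix indexing on a data frame goes through as.matrix(), so a data frame with both numeric and character columns comes back as a character vector. For an all-numeric data frame like the one in the question this does not matter.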