r, regex, readxl

Find cell with grepl & which


I'm trying to find a cell in a data.frame that starts with a given pattern (I need to search rows AND columns). The data.frames come from Excel workbooks read with the readxl package. The location of the pattern varies by workbook: in workbook 1 it might be in cell B12, but in workbook 2 it might be in C12. However, once I find the 'anchor' cell, I can look one cell to the right to find the numeric value I care about.

Currently I'm using this:

my_data <- iris
my_data$Species <- as.character(my_data$Species)
my_data[99,5] <- 'my_string'
my_data[109,5] <- 'my_string_too'

target_val <- 'my_string'
which(my_data == target_val, arr.ind = TRUE)

This works for exact matches, but I'd like the flexibility of grepl for things like 'starts with' or 'OR', and I can't figure out how to combine the two. Is there a way to do something like this:

which(my_data[grepl('my_string|my_string_too', my_data, ignore.case = T)], arr.ind = T)

Solution

  • You can use sapply(), which returns a logical matrix here (one column per data.frame column), and then call which() on that matrix just as you were doing before. The same approach works with %in%, startsWith() or endsWith(), or with grepl() and fixed = TRUE if you are doing literal string matching.

    which(sapply(my_data, function(x) grepl("^my_string", x)), arr.ind = TRUE)
    
         row col
    [1,]  99   5
    [2,] 109   5
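To tie this back to the original goal (find the anchor cell, then read the value one cell to the right), here is a minimal sketch. The helper name `find_cells` and the toy `sheet` data frame are hypothetical, made up for illustration; a real sheet would come from readxl instead. It builds the logical matrix with `lapply()` and `cbind()` rather than `sapply()`, so the result is still a matrix when the data frame has only one row.

```r
# Hypothetical helper (not part of any package): returns the (row, col)
# position of every cell in `df` that matches `pattern`.
# Extra arguments such as ignore.case = TRUE are passed on to grepl().
find_cells <- function(df, pattern, ...) {
  # One logical vector per column, bound together into a logical matrix.
  hits <- do.call(cbind, unname(lapply(df, function(x) grepl(pattern, x, ...))))
  which(hits, arr.ind = TRUE)
}

# Toy stand-in for a sheet read from Excel (assumed layout, for illustration).
sheet <- data.frame(
  A = c("Report", NA, "x"),
  B = c(NA, "Total sales", NA),
  C = c(NA, "1234.5", NA),
  stringsAsFactors = FALSE
)

# "Starts with" plus case-insensitive matching; NA cells never match.
pos <- find_cells(sheet, "^total", ignore.case = TRUE)

# Take the first match and read the cell one column to the right.
anchor_row <- pos[1, "row"]
anchor_col <- pos[1, "col"]
value <- as.numeric(sheet[[anchor_col + 1]][anchor_row])
value  # 1234.5
```

If the pattern can match more than once, or the anchor can sit in the last column, it is worth checking `nrow(pos)` and `anchor_col < ncol(sheet)` before indexing.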