Similar to this question, but for column index instead of row: How do you find the index of a row that has a certain partial string?
I'm scraping data from PDFs, and tables on different pages have similar information in different alignments. Piece of information "A" may be in Column 2 Row 3 in Table A, and Column 1 Row 4 in Table B.
As indicated above, we can capture the row index using:
df <- data.frame(
v1 = c("a", "b", "c", "d", "e"),
v2 = c("x", "y", "a", "z", "f"),
v3 = c("x", "y", "a", "z", "s"),
v4 = c("w", "c", "q", "p", "r")
)
library(tidyverse)
df %>%
mutate(ind = row_number()) %>%
filter(if_any(everything(), ~ str_detect(., 'a'))) %>%
pull(ind)
# [1] 1 3
But for my (admittedly very unusual) case where I don't know the column, how can I capture the column index (e.g., when searching for "a" in the above example, should be 1, 2, 3)?
A base R solution using colSums
:
search_term <- "a"
which(colSums(df == search_term) > 0)
#v1 v2 v3
# 1 2 3
The above does an exact match using ==
, if you are looking for partial match you may use grepl
.
which(colSums(sapply(df, grepl, pattern = search_term)) > 0)