Tags: r, text, subset, tidytext

Select rows containing a text code that appears in multiple combinations within a data frame in R


I want to subset data based on a text code that appears in numerous combinations throughout one column of a data frame. I first checked all the variations by tabulating that column.

    variations <- as.data.frame(table(EQP$col1))


I want to search within the data frame for the text "EFC" (even when it is combined with other letters) and subset those rows, so that the resulting data frame contains only the rows whose value includes "EFC".

I have looked through the question "How to Extract keywords from a Data Frame in R", but it does not answer mine. I have also reviewed the tidytext package, but it does not seem to be the solution either.


Solution

  • You can simply use grepl.

    Assuming your data frame is called df and the column to subset on is col1:

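    # Example data: two of the four rows contain "EFC"
    df <- data.frame(
        col1 = c("eraEFC", "dfs", "asdj, aslkj", "dlja,EFC,:LJ)"),
        stringsAsFactors = FALSE
    )

    # Keep the rows where col1 contains "EFC"; drop = FALSE keeps the result a data frame
    df[grepl("EFC", df$col1), , drop = FALSE]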
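A few variations on the grepl approach may be useful, depending on how the codes are written. The sketch below reuses the toy data from the answer and adds one lowercase row, which is an assumption made purely for illustration and is not part of the original question. The object names exact, any_case and exact_subset are likewise hypothetical.

```r
# Toy data from the answer, plus one lowercase variant
# (the last row is an added assumption, for illustration only).
df <- data.frame(
    col1 = c("eraEFC", "dfs", "asdj, aslkj", "dlja,EFC,:LJ)", "xx efc yy"),
    stringsAsFactors = FALSE
)

# fixed = TRUE treats "EFC" as a literal string rather than a regular expression.
exact <- df[grepl("EFC", df$col1, fixed = TRUE), , drop = FALSE]

# ignore.case = TRUE also matches "efc", "Efc", and so on.
any_case <- df[grepl("EFC", df$col1, ignore.case = TRUE), , drop = FALSE]

# subset() expresses the same filter with less indexing syntax.
exact_subset <- subset(df, grepl("EFC", col1, fixed = TRUE))
```

Literal matching with fixed = TRUE matters mainly when the code being searched for contains characters that are special in regular expressions, such as "." or "("; for a plain string like "EFC" both forms select the same rows.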