r subset trigram

How to select columns by values in a row in R


I have a large data frame marking occurrences of trigrams in strings, where the strings are the rows, the trigrams are the columns, and the values mark whether a trigram occurs in a string.

Something like this:

strs <- c('this', 'that', 'chat', 'chin')
thi <- c(1, 0, 0, 0)
tha <- c(0, 1, 0, 0)
hin <- c(0, 0, 0, 1)
hat <- c(0, 1, 1, 0)
df <- data.frame(strs, thi, tha, hin, hat)
df

#  strs thi tha hin hat
#1 this   1   0   0   0
#2 that   0   1   0   1
#3 chat   0   0   0   1
#4 chin   0   0   1   0

I want to get all of the columns/trigrams that have a 1 for a given row or a given string.

So for row 2, the string 'that', the result would be a data frame that looks like this:

  strs tha hat
1 this   0   0
2 that   1   1
3 chat   0   1
4 chin   0   0

How could I do this?


Solution

  • This returns the desired data frame for a given string:

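    givenStr <- "that"
    # The row for the given string (assumes it occurs exactly once in df$strs)
    row <- df[df$strs == givenStr, ]
    # Keep column 1 (strs) plus every trigram column that is 1 in that row
    df[, c(1, 1 + which(row[, -1] == 1))]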
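
If you need this for more than one string, the same idea can be wrapped in a small function that selects columns by name instead of by position. This is only a sketch: `select_trigrams` and its `id_col` argument are made-up names, not part of any package, and it assumes each string appears exactly once in the id column.

```r
# Rebuild the example data frame from the question.
df <- data.frame(
  strs = c("this", "that", "chat", "chin"),
  thi  = c(1, 0, 0, 0),
  tha  = c(0, 1, 0, 0),
  hin  = c(0, 0, 0, 1),
  hat  = c(0, 1, 1, 0)
)

# select_trigrams() is a hypothetical helper (an assumption for illustration).
# It keeps the id column plus every trigram column equal to 1 in the row
# whose id matches `given`.
select_trigrams <- function(data, given, id_col = "strs") {
  hit <- which(data[[id_col]] == given)
  if (length(hit) != 1) {
    stop("expected exactly one row matching '", given, "'")
  }
  trigram_cols <- setdiff(names(data), id_col)
  # unlist() turns the one-row slice into a named vector, one entry per trigram.
  present <- unlist(data[hit, trigram_cols, drop = FALSE]) == 1
  # drop = FALSE keeps the result a data frame even if no trigram matches.
  data[, c(id_col, trigram_cols[present]), drop = FALSE]
}

result <- select_trigrams(df, "that")
print(result)
```

Selecting by name means the result does not depend on the id column being first, and `drop = FALSE` avoids getting a bare vector back when only one column is selected.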