I have a data frame with two columns and a vector of names. How can I select the rows of the data frame whose name matches one of the strings in the vector?
name = c("p4@HPS1", "p7@HPS2", "p4@HPS3", "p7@HPS4", "p7@HPS5", "p9@HPS6", "p11@HPS7", "p10@HPS8", "p15@HPS9")
expression = c(118.84, 90.04, 106.6, 104.99, 93.2, 66.84, 90.02, 108.03, 111.83)
dataset <- data.frame(name, expression)  # data.frame() keeps expression numeric; cbind() would coerce it to character
nam <- c("HPS5", "HPS6", "HPS9", "HPS2")
The result should be a data frame containing only the matching rows.
I tried
dataset[mapply(grepl,nam,dataset$name)]
but it didn't work
We can use paste with collapse on 'nam' to build a single pattern, pass it as the pattern argument of grep to get the indices of the matching rows, and use those indices to subset 'dataset'.
dataset[grep(paste(nam, collapse="|"), dataset$name),]
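To make the intermediate steps visible, here is a minimal sketch that rebuilds the question's data and splits the one-liner into parts. The variable names pattern, idx and result are only illustrative and do not appear in the original code.

```r
# Sample data from the question
name <- c("p4@HPS1", "p7@HPS2", "p4@HPS3", "p7@HPS4", "p7@HPS5",
          "p9@HPS6", "p11@HPS7", "p10@HPS8", "p15@HPS9")
expression <- c(118.84, 90.04, 106.6, 104.99, 93.2, 66.84, 90.02, 108.03, 111.83)
dataset <- data.frame(name, expression)
nam <- c("HPS5", "HPS6", "HPS9", "HPS2")

# Collapse the names into a single alternation pattern
pattern <- paste(nam, collapse = "|")
pattern
# "HPS5|HPS6|HPS9|HPS2"

# grep() returns the positions of the elements that match the pattern
idx <- grep(pattern, dataset$name)
idx
# 2 5 6 9

# Subset the rows; note the trailing comma inside the brackets
result <- dataset[idx, ]
```

The rows come back in the order they appear in 'dataset', not in the order of 'nam'. Because this is a regular-expression match, a name such as HPS1 would also match HPS10; if that matters, one option is to anchor each name, for example with paste0(nam, "$", collapse = "|").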
If we are using the OP's code, wrap the 'name' column in a list. Otherwise mapply will pair up the individual elements of 'name' and 'nam', and because the two vectors have different lengths, it will throw a warning that the longer argument is not a multiple of the length of the shorter. With the list, mapply returns a logical matrix; we take its rowSums and check whether each sum is greater than 0 to get a logical vector for subsetting the rows.
dataset[rowSums(mapply(grepl, nam, list(dataset$name)))>0,]
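As a rough illustration of what the mapply call produces, the sketch below looks at the intermediate matrix using the question's data. The names m and keep are introduced here only for the example.

```r
# Sample data from the question
name <- c("p4@HPS1", "p7@HPS2", "p4@HPS3", "p7@HPS4", "p7@HPS5",
          "p9@HPS6", "p11@HPS7", "p10@HPS8", "p15@HPS9")
nam <- c("HPS5", "HPS6", "HPS9", "HPS2")

# list(name) has length 1, so the whole 'name' vector is reused for every
# element of 'nam'; each column of m is grepl(nam[i], name)
m <- mapply(grepl, nam, list(name))
dim(m)
# 9 4  (one row per name, one column per pattern)

# A row is kept if at least one pattern matched it
keep <- rowSums(m) > 0
which(keep)
# 2 5 6 9
```

Using keep as the row index, dataset[keep, ], selects the same rows as the grep approach.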