r, text-mining, tm

How can I use some keywords to find which articles contain these keywords?


I am new to R. I have some articles (.txt files) saved in a folder, and I can already import them into R. I have two ways of doing it and I don't know which one is better.

Here is my code:

# 1: load every file in the folder into a tm corpus
library(tm)   
cname <- file.path("D:/magazine_pass")
docs <- Corpus(DirSource(cname), readerControl=list(reader=readPlain))

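# 2: read each file into a named list of character vectors (one per file)
dir.list <- list.files("D:/magazine_pass", full.names = TRUE)
texts <- vector("list", length(dir.list))
names(texts) <- basename(dir.list)
for (i in seq_along(dir.list)) {
  s <- readLines(dir.list[i], encoding = "ASCII")
  # strip any characters that cannot be represented in ASCII
  texts[[i]] <- iconv(s, "ASCII", "ASCII", sub = "")
}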

I also want to use some keywords (e.g. clean energy, wearable device) to find which articles contain them. How can I do that?

Please show me the code and briefly explain it. Thanks a lot.


Solution

  • label1 = subset(docs, grepl(paste(c("clean energy","wearable device"), collapse = "|"), docs))

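    This should look through your corpus and pull out any entries that contain the keywords passed to grepl. paste(..., collapse = "|") joins the keywords into a single regular expression, "clean energy|wearable device", which matches either phrase. grep searches each element of a character vector for matches to a pattern. grepl does the same but returns a logical vector (TRUE/FALSE), one value per element, and subset keeps the entries where that value is TRUE.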
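
    If you also want to know which keyword matched which article, here is a minimal sketch in base R. It assumes each article has been collapsed into one string in a named character vector. The helper name find_keyword_articles and the sample texts are made up for illustration and are not part of tm or of the question's data.

```r
# Hypothetical helper: report which articles contain which keywords.
# `texts` is a named character vector with one element per article.
find_keyword_articles <- function(texts, keywords) {
  # fixed = TRUE treats each keyword as a literal string, not a regex;
  # lowercasing both sides makes the match case-insensitive.
  hits <- vapply(keywords, function(k) {
    grepl(tolower(k), tolower(texts), fixed = TRUE)
  }, logical(length(texts)))
  # Force a matrix shape (articles x keywords) even for a single article.
  matrix(hits, nrow = length(texts), ncol = length(keywords),
         dimnames = list(names(texts), keywords))
}

# Made-up sample articles standing in for the .txt files.
# For real files, `texts` could be built along these lines:
#   files <- list.files("D:/magazine_pass", full.names = TRUE)
#   texts <- vapply(files, function(f)
#     paste(readLines(f, warn = FALSE), collapse = " "), character(1))
#   names(texts) <- basename(files)
texts <- c(
  a.txt = "Clean energy investment rose again this year.",
  b.txt = "A new wearable device tracks sleep.",
  c.txt = "Nothing relevant here."
)
keywords <- c("clean energy", "wearable device")

hits <- find_keyword_articles(texts, keywords)

# Articles that contain at least one keyword.
matched <- names(texts)[rowSums(hits) > 0]

print(hits)
print(matched)
```

    Each row of hits is an article and each column is a keyword, so rowSums(hits) > 0 picks out the articles that mention at least one of them. If you would rather stay inside a tm corpus, tm_filter accepts a function that is applied to each document and keeps those for which it returns TRUE.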