r shiny tm quanteda

Display matching sentences by text typed in a Shiny app text box


I am trying to build a Shiny app that dynamically displays sentences from a database column by matching them against a corpus built from a text box. That is, as the user starts typing in the text box, all sentences that match the corpus built from the typed text should be displayed, ordered by the number of words that match. I tried the kwic function, but it does not help with matching the corpus dynamically. This is the approach I tried:

require(quanteda)
require(tm)
data(crude, package = "tm")
mycorpus <- corpus(crude)

kwic(mycorpus, "company") # Pass the words from the text box corpus

Any help would be appreciated.


Solution

  • I think what you're asking for is:

    table(kwic(mycorpus, phrase, join = FALSE)$keyword)
    

    where phrase just gets lengthened as more terms are typed in. (Requires quanteda >= 0.99, which also includes the phrase() function, which might be useful here.) For a more general match, you could convert both the corpus and all entered terms (in an ever-lengthening phrase) into tokenized word stems:

    mystems <- corpus(crude) %>% texts() %>% tokens() %>% tokens_wordstem()
    phrase <- tokens(phrase, remove_punct = TRUE, remove_symbols = TRUE) %>%
        tokens_wordstem(language = "english") %>% # use the same stemmer language as for the corpus
        as.character()
    

    Then table(kwic(mystems, phrase, join = FALSE)$keyword) should do the same thing, but matching word stems only rather than exact words. If you want the number of words that match in each document, an *apply-type wrapper (or purrr::map()) around this call will also extract that.
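
The per-document counting mentioned above could look roughly like the following. This is only a minimal base-R sketch of the idea, not the quanteda approach itself: it swaps kwic() for plain tokenization with strsplit() so that it runs without any packages, and it matches exact lower-cased words rather than word stems. The helper name rank_sentences and the sample sentences are made up for illustration.

```r
# Hypothetical helper (not part of quanteda or tm): rank sentences by how many
# of the distinct words typed so far each sentence contains.
rank_sentences <- function(sentences, typed) {
  # Lower-case, split on anything that is not a letter or digit,
  # and keep the distinct non-empty words of each string.
  tokenize <- function(x) {
    words <- strsplit(tolower(x), "[^[:alnum:]]+")
    lapply(words, function(w) unique(w[nzchar(w)]))
  }

  # Collapse the input to one string so NULL or empty input is handled.
  query <- tokenize(paste(typed, collapse = " "))[[1]]
  if (length(query) == 0) {
    return(data.frame(sentence = character(0), matches = integer(0)))
  }

  # *apply-type wrapper: one match count per sentence.
  matches <- vapply(tokenize(sentences),
                    function(w) sum(query %in% w),
                    integer(1))

  hits <- data.frame(sentence = sentences, matches = matches,
                     stringsAsFactors = FALSE)
  hits <- hits[hits$matches > 0, , drop = FALSE]   # drop non-matching sentences
  hits <- hits[order(-hits$matches), , drop = FALSE] # most matches first
  rownames(hits) <- NULL
  hits
}

# Made-up sample data standing in for the database column.
sentences <- c("The company cut crude oil prices.",
               "Oil prices fell again.",
               "The weather was mild.")
print(rank_sentences(sentences, "oil company"))
```

In a Shiny app, a function like this could be called inside renderTable() with the current value of a textInput(), so the table is recomputed each time the typed text changes. To match on stems instead of exact words, the tokenize step would be the place to plug in tokens_wordstem().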