
Manipulating textInput in R Shiny


I am relatively new to R and even newer to Shiny (literally my first day).

I would like a user to input multiple phrases separated by commas, such as female, aged, diabetes mellitus. I have a dataframe in which one variable, MH2, contains text words. I would like to output a dataframe that contains only the rows in which all of the inputted phrases are present. Sometimes a user may input only one phrase, other times five.

This is my ui.R

library(shiny)
library(stringr)

# load dataset
load(file = "./data/all_cardiovascular_case_reports.Rdata")

ui <- fluidPage(
  sidebarLayout(
    sidebarPanel(
      textInput(inputId = "phrases", 
                label = "Please enter all the MeSH terms that you would like to search, each separated by a comma:",
                value = ""),

      helpText("Example: female, aged, diabetes mellitus")

    ),
    mainPanel(DT::dataTableOutput("dataframe"))
  )
)

and here is my server.R

library(shiny)

server <- function(input, output)
{
  # where all the code will go
    df <- reactive({

      # counts how many phrases there are
      num_phrases <- str_count(input$phrases, pattern = ", ") + 1

      a <- numeric(num_phrases) # initialize vector to hold all phrases

      # create vector of all entered phrases
      for (i in 1:num_phrases)
      {
        a[i] <- noquote(strsplit(input$phrases, ", ")[[i]][1])
      }

      # make all phrases lowercase
      a <- tolower(a)

      # do exact case match so that each phrase is bound by "\\b"
      a <- paste0("\\b", a, sep = "")
      exact <- "\\b"
      a <- paste0(a, exact, sep = "")

      # subset dataframe over and over again until all phrases used
      for (i in 1:num_phrases)
      {
        final <- final[grepl(pattern = a, x = final$MH2, ignore.case = TRUE), ]
      }

      return(final)
    })

    output$dataframe <- DT::renderDataTable({df()})
}

When I tried running renderText({num_phrases}) I consistently got 1 even when I would input multiple phrases separated by commas. Since then, whenever I try to input multiple phrases, I run into "error: subscript out of bounds." However, when I enter the words separated by a comma only versus a comma and space (entering "female,aged" instead of "female, aged") then that problem disappears, but my dataframe doesn't subset correctly. It can only subset one phrase.

Please advise.

Thanks.


Solution

  • I think your Shiny logic looks good, but the function for subsetting the dataframe has a few small issues. In particular:

    a[i] <- noquote(strsplit(input$phrases, ", ")[[i]][1])

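    The indices [[i]] and [1] are swapped here. strsplit returns a list with one element per input string, so for a single string the phrases are all in the first list element; it should be [[1]][i].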
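    To make the indexing concrete, here is a minimal sketch. The input string is made up and simply stands in for input$phrases:

    ```r
    # Hypothetical input standing in for input$phrases
    phrases <- "female, aged, diabetes mellitus"

    parts <- strsplit(phrases, ", ")

    length(parts)    # one list element per input string, so 1 here
    parts[[1]]       # character vector holding the three phrases
    parts[[1]][2]    # the second phrase, "aged"

    # parts[[2]] does not exist; capture the error message instead of stopping
    res <- tryCatch(parts[[2]], error = function(e) conditionMessage(e))
    res              # contains "subscript out of bounds"
    ```

    This probably also explains why the error disappears for "female,aged": that string contains no ", ", so the loop runs once and the whole string is treated as a single phrase.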

    final <- final[grepl(pattern = a, x = final$MH2, ignore.case = TRUE), ]
    

    You cannot match multiple patterns like this: grepl uses only the first element of a, which is also what the warning from R tells you.
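    If you want to keep your loop, one possible fix is to index the pattern vector inside it. This is only a sketch on made-up data; the column name MH2 is taken from your question, everything else is an assumption:

    ```r
    # Made-up example data; MH2 mirrors the column name from the question
    final <- data.frame(MH2 = c("ab cd ef", "ab ef", "cd ef ab", "ef gx"),
                        stringsAsFactors = FALSE)

    # Patterns already wrapped in word boundaries
    a <- c("\\bab\\b", "\\bcd\\b")

    # Filter once per pattern, passing a[i] rather than the whole vector a
    for (i in seq_along(a)) {
      keep <- grepl(pattern = a[i], x = final$MH2, ignore.case = TRUE)
      final <- final[keep, , drop = FALSE]
    }

    final$MH2   # only rows containing both "ab" and "cd" remain
    ```

    Using seq_along(a) instead of 1:num_phrases also avoids a stray iteration if a happens to be empty.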


    Example working code

    I have changed input$phrases to inp_phrases here. If this script does what you want, I think you can easily copy it into your reactive, making the necessary changes (i.e. changing inp_phrases back and adding the return(result) statement). I was also not entirely clear whether you wanted all patterns to be matched within one row, or to return all rows where any of the patterns were matched, so I added both; you can uncomment the one you need:

    library(stringr)
    
    # some example data
    inp_phrases = "ab, cd"
    final = data.frame(index = c(1, 2, 3, 4),
                       MH2 = c("ab cd ef", "ab ef", "cd ef ab", "ef gx"),
                       stringsAsFactors = FALSE)
    
    # this could become just two lines:
    # (tolower is vectorised and noquote is not needed, so no sapply is required)
    a <- tolower(strsplit(inp_phrases, ", ")[[1]])
    a <- paste0("\\b", a, "\\b") 
    
    # Two options here, uncomment the one you need.
    # Top one: match any pattern in a. Bottom: match all patterns in a
    # indices = grepl(pattern = paste(a,collapse="|"), x = final$MH2, ignore.case = TRUE)
    indices = colSums(do.call(rbind,lapply(a, function(x) grepl(pattern = x, x = final$MH2, ignore.case = TRUE))))==length(a)
    
    result <- final[indices,]
    

    Returns:

      index      MH2
    1     1 ab cd ef
    3     3 cd ef ab
    

    ... with the second version of indices (match all) or

      index      MH2
    1     1 ab cd ef
    2     2    ab ef
    3     3 cd ef ab
    

    ... with the first version of indices (match any)
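    For use inside the reactive, the "match all" variant could be wrapped in a small helper. This is a sketch under assumptions, not a definitive implementation: the function name filter_all_phrases and its arguments are hypothetical, and it splits on a bare comma and trims whitespace so that "female,aged" and "female, aged" behave the same:

    ```r
    # Hypothetical helper: keep the rows of `data` whose `column`
    # contains every comma-separated phrase in `phrases`
    filter_all_phrases <- function(data, phrases, column = "MH2") {
      # Split on commas, then trim whitespace around each phrase
      terms <- trimws(strsplit(phrases, ",", fixed = TRUE)[[1]])
      terms <- terms[nzchar(terms)]

      # Nothing entered yet: return the data unfiltered
      if (length(terms) == 0) return(data)

      # One logical vector per phrase, combined with a row-wise AND
      patterns <- paste0("\\b", tolower(terms), "\\b")
      hits <- lapply(patterns, grepl, x = data[[column]], ignore.case = TRUE)
      data[Reduce(`&`, hits), , drop = FALSE]
    }

    # Same example data as above
    final <- data.frame(index = c(1, 2, 3, 4),
                        MH2 = c("ab cd ef", "ab ef", "cd ef ab", "ef gx"),
                        stringsAsFactors = FALSE)

    result <- filter_all_phrases(final, "ab,cd")

    # In server.R this might become:
    # df <- reactive({ filter_all_phrases(final, input$phrases) })
    ```

    Note that the phrases are still treated as regular expressions, so input containing characters such as ( or . would need escaping first.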

    Hope this helps!