I am developing an app in R shiny that will allow users to run the Boruta algorithm on an uploaded dataset. It is including the outcome variable in the output. I understand it shouldn't do that but don't know how to make it stop. Here's a cut down version of my code - as you can see the iris dataset is loaded and the user selects the outcome variable before running the feature selection on the second page:
library(shiny) ## for app building
library(shinythemes) ## for formatting
library(Boruta) ## Boruta algorithm
# Define UI for demo app ----
ui <- fluidPage(
navbarPage("", collapsible = TRUE, inverse = TRUE, theme = shinytheme("cerulean"),
tabPanel("Read data",
fluidPage(
sidebarLayout(
sidebarPanel(width = 4,
# Outcome variable ----
textOutput("Outcome1"),
),
# Output: Data file ----
mainPanel(
h4("Table 1"),
helpText("Click into the table to identify the outcome variable."),
DTOutput("File1_table"),
)
))),
tabPanel("Feature selection",
fluidPage(
br(),
sidebarLayout(
sidebarPanel(
selectInput("For_Boruta", "Choose a dataset for feature selection", c("Table1","Table2"),selected = NULL),
hr(),
textOutput("recap_selections"),
actionButton("boruta", "Run Boruta"),
),
# Boruta output
mainPanel(
h4("Boruta Results"),
tableOutput("BorutaTbl"),
plotOutput("BorutaPlot"),
)
))),
))
################################
# #
# Server side logic #
# #
################################
server <- function(input, output, session) {
Table1 <- reactive({iris })
Out_var1 <- NULL
#show the iris file that has been read in
output$File1_table <- renderDT(
Table1(), options = list(pageLength = 10, lengthChange = FALSE, searching = FALSE),
selection = list(target = 'column') #allows selection of columns
)
#save chosen variable in Out_var1
Out_var1 <<- reactive({names(Table1()[input$File1_table_columns_selected])})
#report variable for table1
output$Outcome1 <- renderPrint(cat('The outcome variable is:\n\n', Out_var1()))
##############################
# Run BORUTA and show output #
##############################
#provide some text to make clear what will run
output$recap_selections <- renderText({
paste("You have chosen to run Boruta on", input$For_Boruta, "where", Out_var1(), "is the outcome variable.
If that is correct click below to run.")
})
#use the boruta button to run as defined above
observeEvent(input$boruta, {
X = eval(parse(text=paste(input$For_Boruta,"()"))) #this refers to the chosen dataset
set.seed(206)
boruta_output <- Boruta(x = X,
y = as.factor(X[[Out_var1()]]), doTrace=2)
output$BorutaTbl <- renderTable(attStats(boruta_output),rownames = TRUE)
output$BorutaPlot <- renderPlot({plot(boruta_output)})
})
}
# Create Shiny app ----
shinyApp(ui, server)
I tried removing the as.factor() but it didn't seem to make a difference - the "Species" variable still appears in the output. I'd be grateful for any advice?
On further inspection, second comment above is incorrect: the authors do not drop the classification column from the input data frame. But they also do not use the Boruta
function in a way that I would consider "obvious", even in their own examples.
I don't have the Boruta package, and I am not willing to install it (and its dependencies) simply to answer a question on SO. However, I think I can point you in the right direction.
The examples in the documentation for the Boruta
function generally start like this:
Boruta(Y~.,data=srx)->Boruta.srx
In other words, they provide a formula as the first argument. You need a dynamic formula, rather than a static one, but we can manage that. But before I show you how, I need to digress to a couple of changes I made to your example code.
As I mentioned in my first comment, nesting reactive
s is not a good idea. In my experience, it always ends badly. There are also simpler (to my mind) and easier ways of selecting an input data frame and an outcome column.
First, I modify your definition of the input widget that selects the input data frame and create a new one to select an outcome column.
selectInput(
"For_Boruta",
"Choose a dataset for feature selection",
c("iris","mtcars"),
selected = "iris"
),
selectInput(
"outcomeVar",
"Choose a column as the outcome variable",
c(),
selected = NULL
)
Note that the choices for input$outcomeVar
are undefined. That's because the available columns depend on the selected data frame.
So, in the app's server function, you need to define a reactive
that holds the input data frame
selectedData <- reactive({
if(input$For_Boruta == "iris") {
iris
} else {
mtcars
}
})
and update the choices for input$outcomeVar
whenever it changes
observeEvent(selectedData(), {
updateSelectInput(session, "outcomeVar", choices = names(selectedData()))
})
Now you can create a reactive
to contain the results of the Boruta analysis
borutaOutput <- reactive({
set.seed(206)
# Following format of first code snippet in the Boruta vignette
Boruta(
as.formula(paste0(input$outcomeVar, " ~ .")),
data = selectedData()
)
})
Note the creation of the input formula with as.formula(paste0(input$outcomeVar, " ~ ."))
and provision of the input data frame with data = selectedData()
.
Now you create your output plot and table.
output$BorutaTbl <- renderTable(
attStats(borutaOutput()),
rownames = TRUE
)
output$BorutaPlot <- renderPlot({
plot(borutaOutput())
})
As a result of these changes, there is no nesting of reactive
s in my code.
There are other consequential changes to your "debugging" widgets, but they aren't central to the main purpose of your app. I also dropped your formatting and layout options, as they are peripheral to your question.
Here's the full source of the app.
library(shiny) ## for app building
library(DT)
# Define UI for demo app ----
ui <- fluidPage(
navbarPage("", collapsible = TRUE, inverse = TRUE),
tabPanel("Read data",
fluidPage(
sidebarLayout(
sidebarPanel(width = 4,
textOutput("Outcome1"),
),
mainPanel(
h4("Table 1"),
DTOutput("File1_table"),
)
))),
tabPanel("Feature selection",
fluidPage(
br(),
sidebarLayout(
sidebarPanel(
selectInput(
"For_Boruta",
"Choose a dataset for feature selection",
c("iris","mtcars"),
selected = "iris"
),
selectInput(
"outcomeVar",
"Choose a column as the outcome variable",
c(),
selected = NULL
),
hr(),
textOutput("recap_selections"),
actionButton("boruta", "Run Boruta"),
),
mainPanel(
h4("Boruta Results"),
tableOutput("BorutaTbl"),
plotOutput("BorutaPlot"),
)
)))
)
server <- function(input, output, session) {
selectedData <- reactive({
if(input$For_Boruta == "iris") {
iris
} else {
mtcars
}
})
observeEvent(selectedData(), {
updateSelectInput(session, "outcomeVar", choices = names(selectedData()))
})
output$File1_table <- renderDT(
selectedData(), options = list(pageLength = 10, lengthChange = FALSE, searching = FALSE)
)
output$Outcome1 <- renderPrint(cat('The outcome variable is:\n\n', input$outcomeVar))
#provide some text to make clear what will run
output$recap_selections <- renderText({
paste("You have chosen to run Boruta on", input$For_Boruta, "where", input$outcomeVar, "is the outcome variable.
If that is correct click below to run.")
})
borutaOutput <- reactive({
set.seed(206)
# Following format of first code snippet in the Boruta vignette
Boruta(
as.formula(paste0(input$outcomeVar, " ~ .")),
data = selectedData()
)
})
output$BorutaTbl <- renderTable(
attStats(borutaOutput()),
rownames = TRUE
)
output$BorutaPlot <- renderPlot({
plot(borutaOutput())
})
}
shinyApp(ui, server)
Remember, I don't have the Boruta package, so I can't test the code relating to its use. However, everything else in the app works as I would expect. If the output from the Boruta
call still includes column(s) that you don't want, simply drop them within the borutaTbl
widget. Something like this should do the trick:
output$BorutaTbl <- renderTable(
attStats(borutaOutput()) %>% select(-c(<columns-to-drop>)),
rownames = TRUE
)
That's a tidyverse solution. You can use base R functions as well if you wish.