R: how to select only continuous numeric columns


This might be more of a theory question than a coding question.

I am trying to write a Shiny app that loops through the continuous numeric columns of a data frame and performs tests on those columns. The app lets the user upload their own data frame, so I don't know in advance what it will look like. I know I can keep only the numeric columns with the dplyr package like this:

library(dplyr)
data <- data %>%
        select(where(is.numeric))

This works, but discrete numeric columns are kept as well. I cannot think of a good way to select only the continuous columns.

I've thought of keeping only the columns where the number of times the mode is repeated is less than a certain proportion of the number of rows. Or maybe requiring that the number of unique values be greater than the number of times the mode is repeated. But neither of those seems likely to generalize well, and neither would get rid of ID columns.
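To make the first idea concrete, here is a rough sketch of the mode-share heuristic. The helper name `looks_continuous` and the 5% threshold are made up for illustration, not taken from any package:

```r
# Hypothetical helper: TRUE when the most frequent value accounts for less
# than `max_mode_share` of the non-missing rows. The threshold is arbitrary.
looks_continuous <- function(x, max_mode_share = 0.05) {
  if (!is.numeric(x)) return(FALSE)
  x <- x[!is.na(x)]
  if (length(x) == 0) return(FALSE)
  mode_share <- max(table(x)) / length(x)
  mode_share < max_mode_share
}

# Toy data: an ID column, a discrete code, and a measurement
df <- data.frame(
  id     = 1:100,
  group  = rep(1:4, 25),
  height = seq(150, 199.5, by = 0.5)
)

flags <- sapply(df, looks_continuous)
flags
```

On this toy data the `group` column is rejected, but `id` passes along with `height`, which is exactly the ID-column problem.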

I appreciate any ideas, thanks.


Solution

  • The schoolmath package provides is.decimal and is.whole functions:

    library(schoolmath)
    x <- c(1, 1.5)
    any(is.decimal(x))
    #> [1] TRUE
    

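    So you could check each column of your data frame with sapply (apply would first convert the data frame to a matrix, which turns every column into character if any column is non-numeric):

    decimal_cols <- sapply(df, function(x) is.numeric(x) && any(is.decimal(x)))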
    

    The positions of the TRUE values in the result are the columns that contain decimal values.
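If you would rather not add a dependency, the same check can be sketched in base R. The helper name `has_fraction` and the toy data frame are only for illustration. Note that this treats "has a fractional part" as a stand-in for "continuous", so a measurement column that happens to hold only whole numbers would be dropped:

```r
# Hypothetical helper: TRUE for numeric columns with at least one
# non-whole value (NAs are ignored)
has_fraction <- function(x) {
  is.numeric(x) && any(x %% 1 != 0, na.rm = TRUE)
}

# Toy data with an ID, a whole-number count, a measurement, and text
df <- data.frame(
  id     = 1:3,
  count  = c(2, 5, 7),
  weight = c(1.5, 2, 3.25),
  name   = c("a", "b", "c")
)

# vapply guarantees one logical value per column
keep <- vapply(df, has_fraction, logical(1))

# drop = FALSE keeps the result a data frame even if one column remains
continuous_df <- df[, keep, drop = FALSE]
names(continuous_df)
```

With dplyr, the same predicate should also work as `select(df, where(has_fraction))`.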