r, integer, subset, numeric, dummy-variable

Check if all values in data.frame columns are integers, to subset dummy variables (aka: are all values in a column TRUE?)


I would like to know if there is a simpler way of subsetting a data.frame's integer columns.

My goal is to modify the numerical columns in my data.frame without touching the purely integer columns (in my case containing 0 or 1). The integer columns were originally factor levels turned into dummy variables and should stay as they are. So I want to temporarily remove them.

To distinguish numerical from integer columns I used the OP's version from here (Check if the number is integer).

But is.wholenumber is vectorised: it returns one TRUE/FALSE per element (a logical matrix for a data.frame) instead of a single value per column like is.numeric, so sapply(mtcars, is.wholenumber) does not help me. I came up with the following solution, but is there an easier way?

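data(mtcars)
# TRUE for each element that is a whole number (within floating-point tolerance)
is.wholenumber <- function(x, tol = .Machine$double.eps^0.5) abs(x - round(x)) < tol
# a column counts as an integer column only if every one of its values is whole
integer_column_names <- apply(is.wholenumber(mtcars), 2, mean) == 1
# drop = FALSE keeps a data.frame even if only one column remains
numeric_df <- mtcars[, !integer_column_names, drop = FALSE]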


Solution

  • You can do this with dplyr by passing select_if() a predicate that returns a single TRUE/FALSE per column, as shown here:

    library(dplyr)
    
    is_whole <- function(x) all(floor(x) == x)
    
    df = select_if(mtcars, is_whole)
    

    Or in base R:

    df = mtcars[, sapply(mtcars, is_whole)]
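
Note that the answer's code keeps the whole-number columns, whereas the question asks for the other columns, the non-integer ones. A minimal base R sketch of that complement might look like the following. The helper name `is_whole_col`, the tolerance argument (borrowed from the question's is.wholenumber) and the `na.rm = TRUE` handling are assumptions for illustration, not part of the original answer.

```r
data(mtcars)

# Column-level predicate: a single TRUE/FALSE per column.
# `tol` and the NA handling are assumptions added for this sketch.
is_whole_col <- function(x, tol = .Machine$double.eps^0.5) {
  is.numeric(x) && all(abs(x - round(x)) < tol, na.rm = TRUE)
}

# vapply guarantees exactly one logical value per column
whole <- vapply(mtcars, is_whole_col, logical(1))

# Columns to leave alone (whole-number columns such as 0/1 dummies)
dummy_df <- mtcars[, whole, drop = FALSE]

# Columns to modify (everything else)
numeric_df <- mtcars[, !whole, drop = FALSE]

names(numeric_df)
# "mpg"  "disp" "drat" "wt"   "qsec"
```

After modifying numeric_df, the two parts can be put back together with cbind(numeric_df, dummy_df). In dplyr 1.0.0 and later, select_if() is superseded; the equivalent spelling there is select(mtcars, where(is_whole)).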