I have a question: is there an R function to automatically code binary variables as factors?
I have a tibble with over 80 variables (columns), many of which are boolean in nature (0, 1 and NAs) but were imported by R as numeric. As I would like to avoid transforming them manually into factors, I wondered whether there is a function capable of automatically detecting binary numeric variables in a data.frame (or a tibble) and changing them into factors. I could create such a function myself, but if it already exists, why bother?
Below we assume that a column is regarded as binary as long as it is numeric, has at least one non-NA value, and all of its non-NA values are 0 or 1.
Note that a column which is entirely 0 and NA or entirely 1 and NA is regarded as binary, but if that is undesirable we show how to change the code to require that binary columns have both 0 and 1.
First define a function is_binary that determines whether a column is to be regarded as binary. This function can be changed if you want a different definition of binary. In particular, change 1:2 to 2 in the code below if a column must contain both 0 and 1 to be considered binary. Other definitions are possible if needed.
Next apply is_binary to each column, returning a logical vector ok with one component per column that is TRUE if that column is binary and FALSE otherwise.
In the line computing the answer DF2 we apply factor to each binary column using the argument levels = 0:1 to ensure that columns that have only 0's or only 1's still have both levels.
No packages are used.
DF <- data.frame(a = c(0:1, NA), b = 1:3, c = NA, d = 0) # test data frame
is_binary <- function(x) {
  x0 <- na.omit(x)
  is.numeric(x) && length(unique(x0)) %in% 1:2 && all(x0 %in% 0:1)
}
ok <- sapply(DF, is_binary)
DF2 <- replace(DF, ok, lapply(DF[ok], factor, levels = 0:1))
str(DF2)
## 'data.frame': 3 obs. of 4 variables:
## $ a: Factor w/ 2 levels "0","1": 1 2 NA
## $ b: int 1 2 3
## $ c: logi NA NA NA
## $ d: Factor w/ 2 levels "0","1": 1 1 1
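For the stricter definition mentioned above, where a column must contain both 0 and 1, a minimal sketch might look like the following. The name is_binary_strict is made up here for illustration, and == 2 is used as an equivalent of replacing 1:2 with 2.

```r
# Hypothetical stricter variant: a column counts as binary only if
# both 0 and 1 actually occur among its non-NA values.
is_binary_strict <- function(x) {
  x0 <- na.omit(x)
  is.numeric(x) && length(unique(x0)) == 2 && all(x0 %in% 0:1)
}

DF <- data.frame(a = c(0:1, NA), b = 1:3, c = NA, d = 0) # same test data frame
ok_strict <- sapply(DF, is_binary_strict)
ok_strict
# a is TRUE; b, c and d are FALSE (d contains only 0, so it no longer qualifies)
```

With this version column d stays numeric, because it never takes the value 1.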
We could alternatively use dplyr with is_binary like this:
library(dplyr)
DF %>% mutate(across(where(is_binary), ~ factor(., levels = 0:1)))
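Since the question asks for a single function, the base R steps could be wrapped up roughly as below. This is only a sketch: binary_to_factor and its pred argument are hypothetical names, not an existing function, and vapply is used in place of sapply so that the result is always a logical vector.

```r
is_binary <- function(x) {
  x0 <- na.omit(x)
  is.numeric(x) && length(unique(x0)) %in% 1:2 && all(x0 %in% 0:1)
}

# Hypothetical wrapper: converts every column for which pred() is TRUE
# into a factor with levels 0 and 1, leaving all other columns untouched.
binary_to_factor <- function(df, pred = is_binary) {
  ok <- vapply(df, pred, logical(1))
  if (any(ok)) df[ok] <- lapply(df[ok], factor, levels = 0:1)
  df
}

DF <- data.frame(a = c(0:1, NA), b = 1:3, c = NA, d = 0)
DF3 <- binary_to_factor(DF)
str(DF3)
```

Passing a different predicate as pred, for example one that requires both 0 and 1 to be present, changes the definition of binary without touching the wrapper.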