I am trying to use a logical vector to tell sapply which columns in my dataset to convert to numeric.
My data contain NAs, but every variable is either numeric or character. I take the first complete case (hard-coded below, but I would love suggestions!) and build a logical vector based on whether the first character of each value is a digit or a letter. I would like to use that logical vector to tell sapply which columns to make numeric.
library(dplyr)  # %>% and mutate()
library(tidyr)  # gather()
# make the data frame; cbind() coerces every column to character
color <- c("red", "blue", "yellow")
number <- c(NA, 1, 3)
other.number <- c(4, 5, 7)
df <- cbind(color, number, other.number) %>% as.data.frame(stringsAsFactors = FALSE)
#get the first character of the variables in the first complete case
temp <- sapply(df, function(x) substr(x, 1, 1)) %>% as.data.frame() %>%
.[2,] %>% # hard code, this is the first 'complete case'
gather() %>%
#make the logical variable, which can be used as a vector
mutate(vec = !(value %in% letters))
# apply this vector to sapply + as.numeric to the df
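This is a strange case, but if you need to decide which columns are numeric based on their first element, one idea is to try converting that element to numeric. Any element that is not a number becomes NA (as the warning states), so you can use the result to index the columns. Calling na.omit(df) first drops the incomplete rows, so i[1] is the value from the first complete case and nothing needs to be hard-coded. For example,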
ind <- sapply(na.omit(df), function(i) !is.na(as.numeric(i[1])))
# Warning message:
# In FUN(X[[i]], ...) : NAs introduced by coercion
ind
# color number other.number
# FALSE TRUE TRUE
df[ind] <- lapply(df[ind], as.numeric)
str(df)
#'data.frame': 3 obs. of 3 variables:
# $ color : chr "red" "blue" "yellow"
# $ number : num NA 1 3
# $ other.number: num 4 5 7
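If you want to reuse this and avoid the coercion warning, the same idea can be wrapped in a small function. This is only a sketch: the name convert_numeric_cols is made up for illustration, it assumes every column is character (not factor), and it uses complete.cases() to locate the first complete row instead of na.omit().

```r
# Rebuild the example data as an all-character data frame
df <- data.frame(
  color = c("red", "blue", "yellow"),
  number = c(NA, "1", "3"),
  other.number = c("4", "5", "7"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: convert every column whose value in the first
# complete row parses as a number. Assumes character columns.
convert_numeric_cols <- function(data) {
  first_complete <- which(complete.cases(data))[1]
  if (is.na(first_complete)) return(data)  # no complete row: leave data unchanged

  # TRUE for columns whose test value survives as.numeric()
  is_num <- vapply(
    data,
    function(col) !is.na(suppressWarnings(as.numeric(col[first_complete]))),
    logical(1)
  )

  # Convert only the flagged columns; other unparseable values become NA
  data[is_num] <- lapply(data[is_num], function(col) suppressWarnings(as.numeric(col)))
  data
}

converted <- convert_numeric_cols(df)
str(converted)
```

Note that suppressWarnings() also hides the warning for any stray non-numeric value further down a flagged column, so drop it if you would rather be told about those.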
DATA
dput(df)
structure(list(color = c("red", "blue", "yellow"), number = c(NA,
"1", "3"), other.number = c("4", "5", "7")), .Names = c("color",
"number", "other.number"), row.names = c(NA, -3L), class = "data.frame")
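As a side note, if you do not specifically need the "first complete case" rule, base R's type.convert() might be enough. It is a different technique from the one above: it looks at all values in each column rather than a single element. A sketch using the same data, assuming a reasonably recent version of R where type.convert() accepts a data frame:

```r
# Same data as the dput() output above
df <- structure(
  list(
    color = c("red", "blue", "yellow"),
    number = c(NA, "1", "3"),
    other.number = c("4", "5", "7")
  ),
  row.names = c(NA, -3L),
  class = "data.frame"
)

# as.is = TRUE keeps non-numeric columns as character instead of factor
converted <- type.convert(df, as.is = TRUE)
sapply(converted, class)
```

Columns that contain only whole numbers come back as integer rather than double, so wrap them in as.numeric() if that distinction matters to you.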