
R: Convert all columns to numeric with mutate while maintaining character columns


I have a data frame that stores strings. Some of those strings can be interpreted as numbers, but they are still of class character. I want to automatically convert every column that can be interpreted as numeric to numeric. I can do this easily with mutate_if; however, it produces NAs for every remaining character column. I would like to keep the original values in those columns.

# Reproducible example
library(dplyr)  # provides %>% and mutate_if

df <- data.frame(Col1 = c("647", "237", "863", "236"),
                 Col2 = c("125", "623", "854", "234"),
                 Col3 = c("ABC", "BCA", "DFL", "KFD"),
                 Col4 = c("PWD", "CDL", "QOW", "DKC"))

df %>% mutate_if(is.character, as.numeric)

  Col1 Col2 Col3 Col4
1  647  125   NA   NA
2  237  623   NA   NA
3  863  854   NA   NA
4  236  234   NA   NA
Warning messages:
1: Problem while computing `..1 = across(, ~as.numeric(.))`.
ℹ NAs introduced by coercion 
2: Problem while computing `..1 = across(, ~as.numeric(.))`.
ℹ NAs introduced by coercion 

Desired output:

# Character strings still available
  Col1 Col2 Col3 Col4
1  647  125  ABC  PWD
2  237  623  BCA  CDL
3  863  854  DFL  QOW
4  236  234  KFD  DKC

str(df)
'data.frame':   4 obs. of  4 variables:
 $ Col1: num  647 237 863 236
 $ Col2: num  125 623 854 234
 $ Col3: chr  "ABC" "BCA" "DFL" "KFD"
 $ Col4: chr  "PWD" "CDL" "QOW" "DKC"

Solution

  • A possible solution:

    df <- type.convert(df, as.is = TRUE)
    str(df)
    
    #> 'data.frame':    4 obs. of  4 variables:
    #>  $ Col1: int  647 237 863 236
    #>  $ Col2: int  125 623 854 234
    #>  $ Col3: chr  "ABC" "BCA" "DFL" "KFD"
    #>  $ Col4: chr  "PWD" "CDL" "QOW" "DKC"
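
Note that type.convert returns integer columns here (int), not double (num) as in the desired output. If a mutate-based approach is preferred, or the columns need to be double, the following is a minimal sketch. It assumes dplyr 1.0.0 or later (for across and where); can_be_numeric is a hypothetical helper name introduced only for this illustration.

```r
library(dplyr)

df <- data.frame(Col1 = c("647", "237", "863", "236"),
                 Col2 = c("125", "623", "854", "234"),
                 Col3 = c("ABC", "BCA", "DFL", "KFD"),
                 Col4 = c("PWD", "CDL", "QOW", "DKC"))

# Hypothetical helper (not part of dplyr): TRUE when the column is character
# and every non-missing value can be parsed by as.numeric() without yielding NA.
can_be_numeric <- function(x) {
  is.character(x) && !anyNA(suppressWarnings(as.numeric(x[!is.na(x)])))
}

# Convert only the columns that pass the check; all others are left untouched.
df_num <- df %>%
  mutate(across(where(can_be_numeric), as.numeric))

str(df_num)
```

Col1 and Col2 should come back as num while Col3 and Col4 stay chr. One caveat of this sketch: a character column consisting entirely of NA also passes the check and is converted to numeric.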