r, dataframe, number-formatting

How to convert a column to numeric when it contains both text and numbers stored as strings


I have a data frame with a column that I want to use to join with another data frame. The column contains both numbers stored as strings and non-numeric strings, as follows:

x<-data.frame(referenceNumber=c("80937828","gdy","12267133","72679267","72479267"))

How can I convert the numeric strings to numeric and replace the non-numeric strings with zero/null?

I tried x %>% mutate_if(is.character,as.numeric)

But it returns the following error:

"Error in UseMethod("tbl_vars") : 
  no applicable method for 'tbl_vars' applied to an object of class "character""

Solution

  • We could simply use as.numeric, which assigns NA (with a coercion warning) to any non-numeric entry in the vector. Then we can selectively replace the NA values with zero:

    x <- c("80937828","gdy","12267133","72679267","72479267")
    # Non-numeric entries such as "gdy" become NA
    output <- as.numeric(x)
    # Replace those NA values with zero
    output[is.na(output)] <- 0
    output
    
    [1] 80937828        0 12267133 72679267 72479267
    

    Edit based on the comment by @Sotos: If the column/vector is actually a factor, it has to be cast to character first (as.character) for the answer above to work, because as.numeric on a factor returns the internal level codes rather than the values shown.
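
As a minimal sketch, the same approach applied directly to the data frame column from the question might look like the following. It assumes base R only. Setting stringsAsFactors = TRUE is done here purely to reproduce the factor case, and the suppressWarnings() call is an optional addition that is not part of the original answer.

```r
# Data from the question; stringsAsFactors = TRUE makes the column a factor
x <- data.frame(
  referenceNumber = c("80937828", "gdy", "12267133", "72679267", "72479267"),
  stringsAsFactors = TRUE
)

# Cast to character first, then to numeric; "gdy" becomes NA.
# suppressWarnings() hides the "NAs introduced by coercion" warning.
x$referenceNumber <- suppressWarnings(
  as.numeric(as.character(x$referenceNumber))
)

# Replace the NA values with zero
x$referenceNumber[is.na(x$referenceNumber)] <- 0

x$referenceNumber
```

The as.character() call is harmless when the column is already character, so the same lines should work in either case. Skipping the replacement step leaves the non-numeric entries as NA, which may be closer to the "null" option mentioned in the question and avoids accidental matches on 0 in a later join.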