
Modify a csv column data type from character to numeric to apply range function


The CSV file contains a data set with details of automobiles.

Here the column horsepower is read as character by default. When I applied the range function to horsepower as:

    sapply(Auto[,4],range)

The following error message appears:

    Error in Summary.factor(17L, na.rm = FALSE) : 
      ‘range’ not meaningful for factors

So I tried to convert the character values to numeric as:

    as.numeric(as.character(Auto$horsepower))

This results in the warning message:

    NAs introduced by coercion 

Even after the above step I am not able to apply the range function. How can I use the range function on the horsepower column? Please note that the data set contains the character '?' in the horsepower column at line 127.


Solution

  • The underlying issue here is that horsepower was converted to a factor when the CSV file was read into R. This is due to the presence of the ? character.

    You can avoid this by reading strings as plain characters and telling read.csv to treat ? as a missing value, e.g.

    Auto <- read.csv("myfile.csv", 
                     stringsAsFactors = FALSE, 
                     na.strings = "?")
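
    Here is a minimal sketch of the whole round trip. It assumes a tiny made-up table in place of the real file: the `csv_text` string, the `name` column, and its three rows are hypothetical and used only so that the example runs on its own. Two further points are worth noting. First, `range()` returns `NA NA` when the vector contains a missing value unless `na.rm = TRUE` is passed. Second, `sapply(Auto[,4], range)` calls `range()` once per element, so calling `range()` on the column directly is probably what is wanted.

```r
# Hypothetical stand-in for the CSV file: one row has "?" for horsepower.
csv_text <- "name,horsepower
car_a,130
car_b,?
car_c,95"

# na.strings = "?" turns "?" into NA, so the column is parsed as numeric.
Auto <- read.csv(text = csv_text,
                 stringsAsFactors = FALSE,
                 na.strings = "?")

# Confirm the column type after reading.
str(Auto$horsepower)

# Call range() on the whole column and drop the missing value.
hp_range <- range(Auto$horsepower, na.rm = TRUE)
print(hp_range)

# If the data is already loaded with horsepower as a factor, the conversion
# from the question works too, provided the result is assigned back.
# The "NAs introduced by coercion" warning is expected for the "?" entry.
Auto2 <- read.csv(text = csv_text, stringsAsFactors = TRUE)
Auto2$horsepower <- suppressWarnings(
  as.numeric(as.character(Auto2$horsepower))
)
hp_range2 <- range(Auto2$horsepower, na.rm = TRUE)
```

    In the question, the result of `as.numeric(as.character(Auto$horsepower))` was never stored, so `Auto$horsepower` remained a factor and `range()` kept failing.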