I am having trouble exporting a data frame from R to CSV: my factors seem to be converted to numerics. Calling summary() before exporting gives the following:
JobLevel JobSatisfaction
1:1880 1:1448
2:3134 2:1343
3:1307 3:1996
4: 545 4:2327
5: 248
Then, I exported the file to CSV using the following command:
fwrite(HR, file = "Cleaned Data.csv")
However, when I imported the CSV later, the categorical columns appear to have been converted to continuous ones:
HR2 <- fread("Cleaned Data.csv", na.strings = "", stringsAsFactors = TRUE)
JobLevel JobSatisfaction
Min. :1.000 Min. :1.000
1st Qu.:1.000 1st Qu.:2.000
Median :2.000 Median :3.000
Mean :2.177 Mean :2.731
3rd Qu.:3.000 3rd Qu.:4.000
Max. :5.000 Max. :4.000
I believe gender is fine since it is a string, but is there a way to export my factors with numeric levels so that they are still factors when the CSV is imported later?
Many thanks in advance!
CSV is a generic file format that is just Comma Separated Values. It doesn't contain any information about the classes of columns - that's up to the function that reads the CSV to decide.
To preserve class information when writing to a file, the easiest way is to use an R-specific file format, like RDS (see ?readRDS and ?saveRDS). This works great if you only need R to read the file.
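As a minimal sketch of the RDS route: the toy HR data frame below only stands in for the real data (its values are made up, and the column names are taken from the question), and the temporary file path is an assumption for illustration.

```r
# Toy stand-in for the HR data; values are invented for illustration.
HR <- data.frame(
  JobLevel        = factor(c(1, 2, 3, 2, 5)),
  JobSatisfaction = factor(c(4, 3, 1, 2, 4))
)

# A temporary path is used here; any .rds path would work.
rds_path <- tempfile(fileext = ".rds")

# saveRDS() serialises the R object itself, including column classes.
saveRDS(HR, rds_path)
HR2 <- readRDS(rds_path)

# Both columns should still be factors after the round trip.
str(HR2)
```

Because the object is serialised as is, nothing has to be re-declared on import, but the file is only readable from R.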
If you need other programs to be able to read/write the data too, then you'll need to keep track of the class information yourself and, e.g., use the colClasses argument of fread to specify the column classes when you read in the CSV.
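A sketch of the colClasses approach, assuming the data.table package is installed. The toy data, the Gender column and the temporary file path are assumptions for illustration; only JobLevel and JobSatisfaction come from the question.

```r
library(data.table)

# Toy stand-in for the HR data; values are invented for illustration.
HR <- data.table(
  JobLevel        = factor(c(1, 2, 3, 2, 5)),
  JobSatisfaction = factor(c(4, 3, 1, 2, 4)),
  Gender          = c("F", "M", "F", "M", "F")
)

csv_path <- tempfile(fileext = ".csv")
fwrite(HR, file = csv_path)

# stringsAsFactors = TRUE only affects character columns, so columns
# that look numeric must be declared as factors explicitly.
HR2 <- fread(
  csv_path,
  na.strings = "",
  stringsAsFactors = TRUE,
  colClasses = c(JobLevel = "factor", JobSatisfaction = "factor")
)

# Check the classes the columns were read in as.
sapply(HR2, class)
```

The named vector only needs to list the columns whose guessed type is wrong; fread still detects the types of the remaining columns on its own.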