r, csv, format, xlsx

Keeping leading 0s when using read.csv()


I am trying to create a tool that reads multiple CSVs from a folder and converts them into xlsx. My problem is that some variables contain leading zeros that I want to keep. But the variable names vary between files, and they will also differ every time I need this tool.

So, is there a way to automatically detect leading 0s in any variable when reading a file with read.csv()?

I cannot apply formats after reading because I will not know in advance the names of the variables that need it. I cannot force every column to text either, because other variables need to stay numeric.
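To illustrate the problem with made-up data (the column names `id` and `n` are only placeholders), this small sketch shows what the two obvious approaches do:

```r
# Hypothetical sample data: "id" has leading zeros, "n" is a plain number.
csv_text <- "id,n
007,1
012,2"

# Default behaviour: read.csv() guesses the types, so "007" is read as the integer 7.
default_df <- read.csv(text = csv_text)
str(default_df)

# Forcing every column to character keeps the zeros,
# but "n" is then no longer numeric.
char_df <- read.csv(text = csv_text, colClasses = "character")
str(char_df)
```

Neither result is what I need: the first drops the zeros, the second turns every column into text.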


Solution

  • Define a special class, num2, with a coercion method from character, and then pass it to read.csv as colClasses.

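    # Virtual class used only as a colClasses target.
    setClass("num2")
    
    # read.csv reads each column as character and coerces it via as(x, "num2").
    setAs("character", "num2",
      function(from) {
        from2 <- type.convert(from, as.is = TRUE)
        # Keep the original strings if the column is numeric but any value starts with 0.
        if (is.numeric(from2) && any(grepl("^0", from))) from else from2
      })
    
    # Lines is the sample data given in the Note below.
    DF <- read.csv(text = Lines, colClasses = "num2")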
    str(DF)
    ## 'data.frame':   2 obs. of  4 variables:
    ##  $ a: int  1 2
    ##  $ b: int  2 4
    ##  $ c: chr  "03" "05"
    ##  $ d: chr  "ab" "cd"
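One caveat, offered as a sketch rather than a correction: the pattern "^0" also matches a lone 0 or a decimal such as 0.5, so a column containing those values would be kept as text. If that is not wanted, a stricter pattern such as "^0[0-9]" could be used instead. The class name num3 and the sample data below are made up for this illustration so that they do not clash with num2.

```r
library(methods)  # setClass()/setAs() live here; attached by default in most sessions

# "num3" is a hypothetical class name chosen for this sketch.
setClass("num3")

setAs("character", "num3",
  function(from) {
    from2 <- type.convert(from, as.is = TRUE)
    # Treat a value as zero-padded only if a leading 0 is followed by another digit.
    if (is.numeric(from2) && any(grepl("^0[0-9]", from))) from else from2
  })

# Made-up sample data: "a" contains a plain 0, "e" contains 0.5.
Lines2 <- "a,c,e
0,03,0.5
1,05,1.5"

DF2 <- read.csv(text = Lines2, colClasses = "num3")
str(DF2)
```

With this pattern, columns a and e should stay numeric while c is kept as character.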
    

    Note

    Sample data used by the code above (define Lines before running it):

    Lines <- "a,b,c,d
    1,2,03,ab
    2,4,05,cd"
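For the folder-to-xlsx tool described in the question, the same colClasses trick could be applied to every file. The following is only a sketch: the helper names read_csv_folder and write_xlsx_folder are hypothetical, and it assumes the writexl package is installed for the xlsx step (any other xlsx writer would do). The num2 definition is repeated so that the block runs on its own.

```r
library(methods)  # setClass()/setAs() live here; attached by default in most sessions

# Same class and coercion method as in the answer above.
setClass("num2")
setAs("character", "num2",
  function(from) {
    from2 <- type.convert(from, as.is = TRUE)
    if (is.numeric(from2) && any(grepl("^0", from))) from else from2
  })

# Hypothetical helper: read every CSV in `in_dir` with the "num2" column class.
# Returns a list of data frames named after the files (without extension).
read_csv_folder <- function(in_dir) {
  files <- list.files(in_dir, pattern = "\\.csv$", full.names = TRUE)
  out <- lapply(files, read.csv, colClasses = "num2")
  names(out) <- tools::file_path_sans_ext(basename(files))
  out
}

# Hypothetical helper: write each data frame to <out_dir>/<name>.xlsx.
# Assumes the writexl package is available.
write_xlsx_folder <- function(dfs, out_dir) {
  if (!requireNamespace("writexl", quietly = TRUE)) {
    stop("Package 'writexl' is required for this sketch")
  }
  dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
  for (nm in names(dfs)) {
    writexl::write_xlsx(dfs[[nm]], file.path(out_dir, paste0(nm, ".xlsx")))
  }
  invisible(out_dir)
}
```

A call might then look like `write_xlsx_folder(read_csv_folder("input"), "output")`, where both folder names are placeholders. Columns kept as character are written as text, which is what preserves the zeros in the xlsx file.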