Validating a string vector of digits of a fixed length


I am trying to validate a vector of digit strings using grepl and a regular expression. It looks as if grepl automatically trims leading zeros, so I am getting wrong answers. I tried using as.character, with no success.

This is my function:

isValidTest <- function(x){
  x <- as.character(x)
  grepl("^[[:digit:]]{13}$", x)
}

and my test:

> isValidTest(c(9788467850703,0759398399, 3002011502068, 0788467850703))
[1]  TRUE FALSE  TRUE FALSE

With the last element quoted instead:

> isValidTest(c(9788467850703,0759398399, 3002011502068, "0788467850703"))
[1]  TRUE FALSE  TRUE  TRUE

Note the last item of the vector, 0788467850703, which starts with 0 and for which I would like to get TRUE. Also, why is as.character not working?


Solution

  • as.character does not work as expected because every element of your vector is parsed as a numeric literal, and a number has no leading zeros. They are already lost before as.character (or grepl) is called:

    > x <- c(9788467850703,0759398399, 3002011502068, 0788467850703)
    > x
    [1] 9.788468e+12 7.593984e+08 3.002012e+12 7.884679e+11
    > lapply(x,class)
    [[1]]
    [1] "numeric"
    
    [[2]]
    [1] "numeric"
    
    [[3]]
    [1] "numeric"
    
    [[4]]
    [1] "numeric"
    

    Converting the numbers back to strings therefore cannot recover the leading zeros.

    Use a vector of strings instead (as the author of the first comment recommended).
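
As a minimal sketch of that recommendation, the check can be run on a character vector from the start. The name isValidTestStrict and the is.character guard are illustrative assumptions, not part of the original function; the guard simply makes numeric input fail loudly instead of silently losing zeros.

```r
# Hypothetical stricter variant of isValidTest (name and guard are assumptions).
isValidTestStrict <- function(x) {
  # Refuse non-character input: a numeric literal such as 0788467850703
  # has already been parsed as the number 788467850703 at this point.
  stopifnot(is.character(x))
  grepl("^[[:digit:]]{13}$", x)
}

# All codes are quoted, so leading zeros are preserved.
codes <- c("9788467850703", "0759398399", "3002011502068", "0788467850703")
result <- isValidTestStrict(codes)
print(result)
# [1]  TRUE FALSE  TRUE  TRUE

# The zero is lost when the literal is parsed, before as.character() runs.
lost <- as.character(0788467850703)
print(lost)
# [1] "788467850703"
```

The second element is still FALSE because "0759398399" has only 10 digits, which is the intended outcome of the length check.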