Tags: r, csv, data-import, readr

How to specify the number of digits for numeric values when reading data with read.csv, read_csv or read_excel in R


I am trying to read geographic latitude and longitude data into R. These values are usually numeric with more than 6 decimal places. I tried read_excel() from the "readxl" package, read.csv() from base R, and read_csv() from the "readr" package, but none of them seems to read the data without losing information: in every case the numeric values appear truncated at 4 or 5 decimal places. I also tried setting "options(digits = 8)" before reading the data, but that did not help. Here is a reproducible example using read_csv() from the "readr" package:

read_csv("112.8397456,35.50496106\n112.583984,37.8519194\n112.5826569,37.8602818", col_names = FALSE)

The printed result is cut off after 4 or 5 decimal places:

# A tibble: 3 × 2
        X1       X2
     <dbl>    <dbl>
1 112.8397 35.50496
2 112.5840 37.85192
3 112.5827 37.86028

I have searched Stack Overflow, and it seems that no similar question has been asked. Could anyone suggest a way to read this kind of data without information loss? Thanks. :)


Solution

  • This isn't an issue with readr. The full data is still there; R is just not printing all of it. The same thing happens when you use base R's read.csv():

    library(tidyverse)
    df.readr <- read_csv("112.8397456,35.50496106\n112.583984,37.8519194\n112.5826569,37.8602818", col_names = FALSE)
    
    df.base <- read.csv(textConnection("112.8397456,35.50496106\n112.583984,37.8519194\n112.5826569,37.8602818"), header = FALSE)
    
    # By default R prints 7 significant digits
    getOption("digits")
    #> [1] 7
    
    # Both data frames are printed with 7 significant digits; the stored values are intact
    df.readr
    #> # A tibble: 3 × 2
    #>         X1       X2
    #>      <dbl>    <dbl>
    #> 1 112.8397 35.50496
    #> 2 112.5840 37.85192
    #> 3 112.5827 37.86028
    df.base
    #>         V1       V2
    #> 1 112.8397 35.50496
    #> 2 112.5840 37.85192
    #> 3 112.5827 37.86028
    
    # Bumping up the digits shows more
    options("digits" = 15)
    
    df.readr
    #> # A tibble: 3 × 2
    #>            X1          X2
    #>         <dbl>       <dbl>
    #> 1 112.8397456 35.50496106
    #> 2 112.5839840 37.85191940
    #> 3 112.5826569 37.86028180
    df.base
    #>            V1          V2
    #> 1 112.8397456 35.50496106
    #> 2 112.5839840 37.85191940
    #> 3 112.5826569 37.86028180
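
If you would rather not change the global option, the following is a minimal sketch (base R only, reusing the sample values from the question) of how to confirm that the full value is stored and how to show more digits for a single call. The pillar.sigfig line is an assumption about recent tibble/pillar versions, where tibble printing is reportedly controlled by that option rather than by digits, so it is left commented out.

```r
# Read the sample data with base R (the text argument avoids a temporary file)
df <- read.csv(
  text = "112.8397456,35.50496106\n112.583984,37.8519194\n112.5826569,37.8602818",
  header = FALSE
)

# Format one value with 7 decimal places to confirm nothing was lost on import
first_lon <- sprintf("%.7f", df$V1[1])
first_lon

# Show more significant digits for this call only; the global option is untouched
print(df, digits = 10)

# format() returns character strings, which is handy when writing reports
format(df$V2, digits = 10)

# Assumption: on recent tibble/pillar versions, tibble printing may need
# options(pillar.sigfig = 10) instead of options(digits = ...)
# options(pillar.sigfig = 10)
```

Passing digits to print() or format() only changes how the numbers are displayed; the underlying doubles are the same either way.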