I have an integer, 18495608239531729, that is causing me some trouble with the data.table package: when I read data from a CSV file that stores numbers this big, they are stored as integer64.
Now I would like to filter my data.table like dt[big_integers == 18495608239531729], but this gives me a data type mismatch (comparing integer64 and double).
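For reference, the setup can be reproduced with something along these lines (the column name and sample values here are made up for illustration; integer64 = "integer64" is fread's default when bit64 is installed):
library(data.table)
library(bit64)
dt <- fread("big_integers\n18495608239531729\n18495608239531730\n", integer64 = "integer64")
class(dt$big_integers)
#[1] "integer64"
dt[big_integers == 18495608239531729]  # comparing against a double literal is where the mismatch appears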
I figured that since 18495608239531729 is a really big number, I should perhaps use the bit64 package to handle the data types.
So I did:
library(bit64)
as.integer64(18495608239531729)
> integer64
> [1] 18495608239531728
I thought integer64 should be able to work with much larger values without any issues?
So I did:
as.integer64(18495608239531729) == 18495608239531729
> [1] TRUE
At which point I was happier, but then I figured, why not try:
as.integer64(18495608239531728)
> integer64
> [1] 18495608239531728
Which led me to also try:
as.integer64(18495608239531728) == as.integer64(18495608239531729)
> [1] TRUE
What is the right way to handle big numbers in R without loss of precision? Technically, in my case, I do not do any mathematical operations with the column in question, so I could treat it as a character vector (although I was worried that would take up more memory, and that joins in data.table would be slower).
You are passing a floating point number to as.integer64. The loss of precision is already in your input to as.integer64:
is.double(18495608239531729)
#[1] TRUE
sprintf("%20.5f", 18495608239531729)
#[1] "18495608239531728.00000"
Pass a character string to avoid that:
library(bit64)
as.integer64("18495608239531729")
#integer64
#[1] 18495608239531729
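The same applies to your filter: build the integer64 value from a string and compare against that (assuming the column is already integer64, as fread created it), for example:
dt[big_integers == as.integer64("18495608239531729")]
Keeping the column as integer64 should also be no worse for memory than a character column: integer64 uses 8 bytes per value, while a character vector stores a pointer per value plus the strings themselves.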