Search code examples
rmemorygmpbigintmpfr

Using big integers with gmp and machine limits


I wonder if it is possible to use integers bigger than the value of .Machine$double.xmax (~1.79e308) in R. I thought that by using e.g. Rmpfr or gmp libraries in R you could assign values of any size, up to the limit of RAM on your system? I thought this was greater than .Machine$double.xmax but clearly it isn't.

> require( gmp )
> as.bigz( .Machine$double.xmax )
Big Integer ('bigz') :
[1] 179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321326889464182768467546703537516986049910576551282076245490090389328944075868508455133942304583236903222948165808559332123348274797826204144723168738177180919299881250404026184124858368
> as.bigz( 1e309 )
Big Integer ('bigz') :
[1] NA
> 

Is it possible for someone explain why a computer using 64bit memory addressing can't store values greater than 1.79e308? Sorry - I don't have a computer science background, but I am trying to learn.

Thanks.


Solution

  • Rmpfr can do the string conversion using mpfr_set_str ...

    val <- mpfr("1e309")
    
    ## 1 'mpfr' number of precision  17   bits 
    ## [1] 9.999997e308
    
    # set a precision (assume base 10)...
    est_prec <- function(e) floor( e/log10(2) ) + 1
    
    val <- mpfr("1e309", est_prec(309) )
    
    ## 1 'mpfr' number of precision  1027   bits 
    ## [1]1000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
    
    .mpfr2bigz(val)
    
    ## Big Integer ('bigz') :
    ## [1] 1000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
    
    # extract exponent from a scientific notation string
    get_exp <- function( sci ) as.numeric( gsub("^.*e",'', sci) )
    
    # Put it together
    sci2bigz <- function( str ) {
      .mpfr2bigz( mpfr( str, est_prec( get_exp( str ) ) ) )
    }
    
    val <- sci2bigz( paste0( format( Const("pi", 1027) ), "e309") )
    
    identical( val, .mpfr2bigz( Const("pi",1027)*mpfr(10,1027)^309 ) )
    
    ## [1] TRUE
    
    ## Big Integer ('bigz') :
    ## [1] 3141592653589793238462643383279502884197169399375105820974944592307816406286208998628034825342117067982148086513282306647093844609550582231725359408128481117450284102701938521105559644622948954930381964428810975665933446128475648233786783165271201909145648566923460348610454326648213393607260249141273724587004
    

    As for the why on storing a number larger than .Machine$double.xmax, the documentation on floating point encoding in the IEEE specs, the R FAQs and wikipedia go into all of the jargon, but I find it helpful to just define the terms (using ?'.Machine')...

    double.xmax(largest normalized floating-point number) =
    (1 - double.neg.eps) * double.base ^ double.max.exp where

    1. double.neg.eps(a small positive floating-point number x such that 1 - x != 1) =double.base ^ double.neg.ulp.digits where
      • double.neg.ulp.digits = the largest negative integer such that 1 - double.base ^ i != 1 and
    2. double.max.exp = the smallest positive power of double.base that overflows and
    3. double.base(the radix for the floating-point representation) = 2 (for binary).

    Thinking in terms of what finite floating point number can be distinguished from another; the IEEE specs tell us that for a binary64 number 11 bits get used for the exponent, so we have a maximum exponent of 2^(11-1)-1=1023 but we want the maximum exponent that overflows so double.max.exp is 1024.

    # Maximum number of representations
    # double.base ^ double.max.exp
    base <- mpfr(2, 2048)
    max.exp <- mpfr( 1024, 2048 )
    
    # This is where the big part of the 1.79... comes from
    base^max.exp
    
    ## 1 'mpfr' number of precision  2048   bits 
    ## [1] 179769313486231590772930519078902473361797697894230657273430081157732675805500963132708477322407536021120113879871393357658789768814416622492847430639474124377767893424865485276302219601246094119453082952085005768838150682342462881473913110540827237163350510684586298239947245938479716304835356329624224137216
    
    # Smallest definitive unit.
    # Find the largest negative integer...
    neg.ulp.digits <- -64; while( ( 1 - 2^neg.ulp.digits ) == 1 ) 
      neg.ulp.digits <<- neg.ulp.digits + 1
    
    neg.ulp.digits
    
    ## [1] -53
    
    # It makes a real small number...
    neg.eps <- base^neg.ulp.digits
    
    neg.eps
    
    ## 1 'mpfr' number of precision  2048   bits 
    ## [1] 1.11022302462515654042363166809082031250000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000e-16
    
    # Largest difinitive floating point number less than 1
    # times the number of representations
    xmax <- (1-neg.eps) * base^max.exp
    
    xmax
    
    ## 1 'mpfr' number of precision  2048   bits 
    ## [1] 179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321326889464182768467546703537516986049910576551282076245490090389328944075868508455133942304583236903222948165808559332123348274797826204144723168738177180919299881250404026184124858368
    
    identical( asNumeric(xmax), .Machine$double.xmax )
    
    ## [1] TRUE