How to convert a data frame of integer64 values to a matrix?


I have a data frame consisting entirely of integer64 columns that I'd like to convert to a matrix.

library(bit64)
(dfr <- data.frame(x = as.integer64(10^(9:18))))
##                      x
## 1           1000000000
## 2          10000000000
## 3         100000000000
## 4        1000000000000
## 5       10000000000000
## 6      100000000000000
## 7     1000000000000000
## 8    10000000000000000
## 9   100000000000000000
## 10 1000000000000000000

Unfortunately, as.matrix doesn't give the correct answer.

(m <- as.matrix(dfr))
##                   x
##  [1,] 4.940656e-315
##  [2,] 4.940656e-314
##  [3,] 4.940656e-313
##  [4,] 4.940656e-312
##  [5,] 4.940656e-311
##  [6,] 4.940656e-310
##  [7,] 4.940656e-309
##  [8,] 5.431165e-308
##  [9,] 5.620396e-302
## [10,] 7.832953e-242

The problem seems to be that integer64 values are stored as numeric (double) values with an "integer64" class attribute, plus some magic to make them print and do arithmetic correctly, and as.matrix strips that attribute.
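
As a minimal sketch of that storage model (the variable names x and raw_bits are only illustrative), the underlying type and the effect of dropping the class attribute can be inspected directly:

```r
library(bit64)

x <- as.integer64(1)

typeof(x)  # underlying storage type is "double"
class(x)   # the "integer64" class attribute drives printing and arithmetic

# Dropping the class exposes the double whose bit pattern encodes the integer,
# which is why the as.matrix output above looks like tiny floating-point numbers.
raw_bits <- unclass(x)
raw_bits
```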

I can't just do class(m) <- "integer64", because that changes the class of the matrix object, not its contents.

Likewise, mode(m) <- "integer64" gives the wrong answer, while typeof(m) <- "integer64" and storage.mode(m) <- "integer64" both throw errors.

Of course, I could circumvent the problem by converting the columns to double (dfr$x <- as.double(dfr$x)), but it feels like there ought to be a way to do this properly.

How can I get a matrix of integer64 values?


Solution

  • For a plain integer64 vector (not R's raw type), assigning the dim attribute directly seems to work:

    > z <- as.integer64(1:10)
    > z
    integer64
     [1] 1  2  3  4  5  6  7  8  9  10
    > dim(z) <- c(10, 1)
    > z
    integer64
          [,1]
     [1,] 1   
     [2,] 2   
     [3,] 3   
     [4,] 4   
     [5,] 5   
     [6,] 6   
     [7,] 7   
     [8,] 8   
     [9,] 9   
    [10,] 10  
    

    For a data frame, cbinding the columns also works:

    > df <- data.frame(x=as.integer64(1:5), y=as.integer64(6:10))
    > df
      x  y
    1 1  6
    2 2  7
    3 3  8
    4 4  9
    5 5 10
    > cbind(df$x, df$y)
    integer64
         [,1] [,2]
    [1,] 1    6   
    [2,] 2    7   
    [3,] 3    8   
    [4,] 4    9   
    [5,] 5    10  
    

    So, for an arbitrary number of columns, do.call is the way to go:

    > do.call(cbind, df)
    integer64
         x y 
    [1,] 1 6 
    [2,] 2 7 
    [3,] 3 8 
    [4,] 4 9 
    [5,] 5 10
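
The do.call approach can be wrapped in a small helper. This is only a sketch: int64_df_to_matrix is a hypothetical name, not a bit64 function, and the column check is an added assumption about what input should be accepted. It is applied here to the data frame from the question.

```r
library(bit64)

# Hypothetical helper (not part of bit64): column-bind every column of a
# data frame, refusing input that contains non-integer64 columns.
int64_df_to_matrix <- function(df) {
  is_int64 <- vapply(df, is.integer64, logical(1))
  if (!all(is_int64)) {
    stop("all columns must be integer64")
  }
  # cbind dispatches to bit64's integer64 method, which keeps the class
  do.call(cbind, df)
}

dfr <- data.frame(x = as.integer64(10^(9:18)))
m <- int64_df_to_matrix(dfr)

class(m)  # should still be "integer64"
dim(m)    # 10 rows, 1 column
```

Rejecting mixed input up front is a design choice for this sketch; it avoids silently coercing other column types to integer64.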