After reading the documentation of embed() from the stats package, I still do not understand the output. If someone could explain the example below in an alternative/simpler way (including the order of the columns), I would appreciate it. Thanks!
> x <- 1:10
> embed (x, 3)
[,1] [,2] [,3]
[1,] 3 2 1
[2,] 4 3 2
[3,] 5 4 3
[4,] 6 5 4
[5,] 7 6 5
[6,] 8 7 6
[7,] 9 8 7
[8,] 10 9 8
x <- 1:10
dimension <- 3
The embed function is written in R itself, so you can view its source code by typing embed in the console. The main part of the code for this case is the following:
n <- length(x)
m <- n - dimension + 1L
data <- x[1L:m + rep.int(dimension:1L, rep.int(m, dimension)) - 1L]
dim(data) <- c(m, dimension)
So n becomes 10, and m is n - dimension + 1L, which is 8 here. The next line is the most important one: it generates the indexes used to subset x, with the help of the rep.int function.
1L:m #is
#[1] 1 2 3 4 5 6 7 8
Next, m is repeated dimension times, which in this case gives
rep.int(m, dimension)
#[1] 8 8 8
dimension:1L #is
#[1] 3 2 1
Now each value of dimension:1L is repeated the number of times given by the matching element of rep.int(m, dimension), that is, 8 times each, which gives
rep.int(dimension:1L, rep.int(m, dimension))
#[1] 3 3 3 3 3 3 3 3 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1
We then subtract 1 from these numbers, which gives
rep.int(dimension:1L, rep.int(m, dimension)) - 1L
#[1] 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0
Now the above sequence is added to 1L:m. Because 1L:m has only 8 elements, R recycles it 3 times to match the 24 elements of the other vector, so you get
1L:m + rep.int(dimension:1L, rep.int(m, dimension)) - 1L
#[1] 3 4 5 6 7 8 9 10 2 3 4 5 6 7 8 9 1 2 3 4 5 6 7 8
The above is the vector of indexes used to subset values from x. Since our x is 1:10 in this case, each value equals its own position, so subsetting returns the same numbers.
x[1L:m + rep.int(dimension:1L, rep.int(m, dimension)) - 1L]
#[1] 3 4 5 6 7 8 9 10 2 3 4 5 6 7 8 9 1 2 3 4 5 6 7 8
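To see that the generated numbers really are positions and not values, it may help to repeat the same step on an input whose values differ from their positions. The vector y and the name m_y below are made up for illustration and are not part of the original question; the index construction is copied from the line above.

```r
# Hypothetical input: values differ from their positions
y <- c(10, 20, 30, 40, 50)
dimension <- 3
m_y <- length(y) - dimension + 1L   # 3 rows for this shorter vector

# Same index construction as in embed(), applied to y
idx <- 1L:m_y + rep.int(dimension:1L, rep.int(m_y, dimension)) - 1L
idx      # positions: 3 4 5 2 3 4 1 2 3
y[idx]   # values:    30 40 50 20 30 40 10 20 30
```

With x <- 1:10 the positions and the values coincide, which is why the subset of x above looks identical to the index vector.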
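Finally, all these numbers are arranged in a matrix with m rows and dimension columns. R fills a matrix column by column, so the first 8 values become column 1, the next 8 become column 2, and the last 8 become column 3.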
dim(data) <- c(m, dimension)
data
# [,1] [,2] [,3]
#[1,] 3 2 1
#[2,] 4 3 2
#[3,] 5 4 3
#[4,] 6 5 4
#[5,] 7 6 5
#[6,] 8 7 6
#[7,] 9 8 7
#[8,] 10 9 8
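On the order of the columns: one way to read the result is that column j holds the series shifted back by j - 1 steps, so each row lists a value first and then the values that came before it. The sketch below rebuilds the same matrix one column at a time. Note that embed_by_lag is a hypothetical helper written only for illustration; it is not how stats::embed is implemented, and it only covers the plain vector case with more than one row.

```r
# Hypothetical helper: build the embedding one lag at a time
embed_by_lag <- function(x, dimension) {
  n <- length(x)
  # lag k keeps x[dimension - k], ..., x[n - k]; k = 0 is the most recent value
  sapply(0:(dimension - 1), function(k) x[(dimension - k):(n - k)])
}

x <- 1:10
res <- embed_by_lag(x, 3)
res[1, ]                  # first row: 3 2 1
all(res == embed(x, 3))   # TRUE: same matrix as embed()
```

Read this way, row 1 says "the value at position 3, then the one before it, then the one before that", which is why the columns run from newest to oldest.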