I need to create a function that returns all possible combinations of the numbers 1:n, where n is the function's argument. I need to do this without using the combn function or any other pre-installed function within R.
The picture above shows what I want to do. The bottom part just uses combn to check whether the function above works.
I tried the following, but it is clearly not the right way yet:
pairwise_comp <- function(n) {
  res <- matrix(nrow = 0, ncol = 2)
  for (i in 1:n) {
    res <- rbind(res, cbind(i, i+1))
  }
  return(res)
}
There are several ways to attack this, some efficient, some readable (subjective), not many are both.
For instance, you can do it recursively, like so:
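pairwise_recur <- function(n, start = 1) {
  if (n == start) return()
  # number of pairs, choose(n, 2); computed here but not used by the rbind below
  nrows <- factorial(n) / (factorial(2) * factorial(n-2))
  res <- matrix(nrow = nrows, ncol = 2)
  rbind(
    # all pairs whose first element is start
    cbind(rep(start, times = n - start),
          1 + start:(n-1)),
    # then the pairs for the remaining, larger first elements
    pairwise_recur(n, start = start + 1)
  )
}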
pairwise_recur(4)
#      [,1] [,2]
# [1,]    1    2
# [2,]    1    3
# [3,]    1    4
# [4,]    2    3
# [5,]    2    4
# [6,]    3    4
But several things about this are less efficient:
- It calls rbind iteratively, growing the result on every recursive call.
- It has no guard on its input: if n < start or n == 0, then it will fail.

And quite possibly:
- If you are not allowed to use factorial in this fashion, you can replace it with prod(1:n). The remaining functions below use this prod method; it is up to you which is preferred.
- factorial and prod will start failing with really high n, likely well beyond the limit you are going to use for this assignment. At those numbers, it will likely be necessary to go into the gamma realm, with more efficient calculations for high-n factorials (and likely necessary until R is fully 64-bit-integer friendly).

An iterative version that fixes some of that might be:
pairwise_iter <- function(n) {
  nrows <- prod(1:n) / ( prod(1:2) * prod(1:(n-2)) )
  res <- matrix(nrow = nrows, ncol = 2)
  r <- 0
  for (i in 1:(n-1)) {
    for (j in (i+1):n) {
      r <- r + 1
      res[r,1] <- i
      res[r,2] <- j
    }
  }
  res
}
# same output
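As a side note on the overflow point above: for pairs specifically, the row count n! / (2! * (n-2)!) simplifies algebraically to n * (n - 1) / 2, which avoids factorials altogether. A minimal sketch, where n_pairs is a hypothetical helper name and not one of the functions in this answer:

```r
# Hypothetical helper (not from the original answer): number of pairs
# from 1:n, using the closed form n * (n - 1) / 2 instead of factorials.
n_pairs <- function(n) {
  n * (n - 1) / 2
}

n_pairs(4)
# [1] 6

# The product 1 * 2 * ... * n overflows to Inf once n passes 170, so the
# factorial-style ratio stops being usable; the closed form is unaffected.
prod(1:171)
# [1] Inf
n_pairs(171)
# [1] 14535
```

It also returns 0 for n of 0 or 1, so it needs no special case there.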
And frankly, one can get rid of the r counter with some clever math on i and j.
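One way to sketch that math: the pairs whose first element is smaller than i occupy (i - 1) * n - i * (i - 1) / 2 rows, and the pair (i, j) sits (j - i) rows after those. The name pairwise_noctr is made up for this illustration, and the sketch assumes n >= 2:

```r
# Illustrative variant (hypothetical name); assumes n >= 2.
pairwise_noctr <- function(n) {
  # n * (n - 1) / 2 is the number of pairs from 1:n
  res <- matrix(nrow = n * (n - 1) / 2, ncol = 2)
  for (i in 1:(n - 1)) {
    for (j in (i + 1):n) {
      # rows taken by all smaller i, plus the offset of j within this i
      r <- (i - 1) * n - i * (i - 1) / 2 + (j - i)
      res[r, 1] <- i
      res[r, 2] <- j
    }
  }
  res
}

pairwise_noctr(4)
#      [,1] [,2]
# [1,]    1    2
# [2,]    1    3
# [3,]    1    4
# [4,]    2    3
# [5,]    2    4
# [6,]    3    4
```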
But pairwise_iter is still prone to problems when n < 3. This can be mitigated with:
pairwise_iter2 <- function(n) {
  if (n <= 1) return(matrix(nrow = 0, ncol = 2))
  nrows <- prod(seq_len(n)) / ( prod(1:2) * prod(seq_len(n-2)) )
  res <- matrix(nrow = nrows, ncol = 2)
  r <- 0
  for (i in 1:(n-1)) {
    for (j in (i+1):n) {
      r <- r + 1
      res[r,1] <- i
      res[r,2] <- j
    }
  }
  res
}
pairwise_iter2(0)
#      [,1] [,2]
pairwise_iter2(1)
#      [,1] [,2]
pairwise_iter2(2)
#      [,1] [,2]
# [1,]    1    2
pairwise_iter2(3)
#      [,1] [,2]
# [1,]    1    2
# [2,]    1    3
# [3,]    2    3
One difference (which is pre-mitigated by the leading if/return) is the use of seq_len: if you want a sequence of length n, then 1:n is accurate only as long as n >= 1. If n is 0, then 1:0 produces a vector of length 2, which is not what you should get; instead seq_len(0) returns a vector of length 0, which is more consistent.
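The difference is easy to see at the console. The snippet below is only an illustration; the variable names count_colon and count_seq are made up for it:

```r
n <- 0

length(1:n)         # 1:0 counts down, giving c(1, 0)
# [1] 2
length(seq_len(n))  # an empty integer vector
# [1] 0

# A loop over 1:n runs twice when n is 0; a loop over seq_len(n) never runs.
count_colon <- 0
for (i in 1:n) count_colon <- count_colon + 1
count_seq <- 0
for (i in seq_len(n)) count_seq <- count_seq + 1
c(count_colon, count_seq)
# [1] 2 0

# prod() of an empty vector is 1, so prod(seq_len(n - 2)) is safe for n = 2,
# whereas prod(1:0) is 0 and would put a zero in the denominator.
prod(seq_len(0))
# [1] 1
prod(1:0)
# [1] 0
```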
This is still not "efficient" in the R way of doing things. For that, you can remove the inner for loop and assign by vectors:
pairwise_vec1 <- function(n) {
  if (n <= 1) return(matrix(nrow = 0, ncol = 2))
  nrows <- prod(seq_len(n)) / ( prod(1:2) * prod(seq_len(n-2)) )
  res <- matrix(nrow = nrows, ncol = 2)
  r <- 0
  for (i in 1:(n-1)) {
    vec <- seq_len(n - i)
    res[r + vec, 1] <- i
    res[r + vec, 2] <- i + vec
    r <- r + length(vec)
  }
  res
}
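For a quick sanity check, the result can be compared against combn, as the question does. The sketch below repeats pairwise_vec1 so that it runs on its own; t() is needed because combn returns one combination per column rather than per row:

```r
# Copy of pairwise_vec1 from above, repeated so this block is self-contained.
pairwise_vec1 <- function(n) {
  if (n <= 1) return(matrix(nrow = 0, ncol = 2))
  nrows <- prod(seq_len(n)) / ( prod(1:2) * prod(seq_len(n-2)) )
  res <- matrix(nrow = nrows, ncol = 2)
  r <- 0
  for (i in 1:(n-1)) {
    vec <- seq_len(n - i)        # offsets 1, 2, ..., n - i
    res[r + vec, 1] <- i         # first element is i for this whole chunk
    res[r + vec, 2] <- i + vec   # second elements are i + 1, ..., n
    r <- r + length(vec)
  }
  res
}

# combn is used only as a reference here, not inside the function.
all.equal(pairwise_vec1(5), t(combn(5, 2)))
# [1] TRUE
```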
It is actually possible to generate this without even the outer for loop, but it requires a bit more vectorized wizardry that is both outside the scope of this assignment and outside of my time to dedicate to this lesson.
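For the curious, here is one possible loop-free sketch. It is not necessarily the approach meant above, the name pairwise_vec2 is made up, and it leans on the base functions rep and sequence, which may or may not be allowed under the assignment's rules:

```r
# Hypothetical loop-free variant; relies on base R's rep() and sequence().
pairwise_vec2 <- function(n) {
  if (n <= 1) return(matrix(nrow = 0, ncol = 2))
  # the first element i appears in (n - i) pairs: n-1, n-2, ..., 1
  counts <- (n - 1):1
  first <- rep(seq_len(n - 1), times = counts)
  # sequence(counts) gives 1..(n-1), 1..(n-2), ..., 1, the offsets above i
  second <- first + sequence(counts)
  matrix(c(first, second), ncol = 2)
}

pairwise_vec2(4)
#      [,1] [,2]
# [1,]    1    2
# [2,]    1    3
# [3,]    1    4
# [4,]    2    3
# [5,]    2    4
# [6,]    3    4
```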