I found this function in another post. Each call returns the next combination of the input vectors, so it is essentially a workaround for expand.grid when there are many vectors with many elements.
Here is the function:
lazyExpandGrid <- function(...) {
  dots <- list(...)
  argnames <- names(dots)
  if (is.null(argnames)) argnames <- paste0('Var', seq_along(dots))
  sizes <- lengths(dots)
  # cumulative products give the stride of each vector in the full grid
  indices <- cumprod(c(1L, sizes))
  maxcount <- indices[ length(indices) ]
  i <- 0
  function(index) {
    # with no argument, advance to the next row; otherwise jump to `index`
    i <<- if (missing(index)) (i + 1L) else index
    # a vector of indices is handled one at a time and row-bound
    if (length(i) > 1L) return(do.call(rbind.data.frame, lapply(i, sys.function(0))))
    if (i > maxcount || i < 1L) return(FALSE)
    # decode the row number into one position per input vector
    setNames(Map(`[[`, dots, (i - 1L) %% indices[-1L] %/% indices[-length(indices)] + 1L ),
             argnames)
  }
}
Here are some example calls:
set.seed(42)
nxt <- lazyExpandGrid(a=1:1e2, b=1:1e2, c=1:1e2, d=1:1e2, e=1:1e2, f=1:1e2)
as.data.frame(nxt()) # prints the 1st possible combination
nxt(sample(1e2^6, size=7)) # prints 7 sampled rows from the sample space
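To make the index-to-row mapping concrete, here is a minimal sketch that isolates the modular arithmetic used inside lazyExpandGrid and checks it against expand.grid on a small grid. The helper name decodeIndex is hypothetical (it is not part of the original function), and the "- 1L" step only applies because the example vectors start at 0.

```r
# Hypothetical helper: decode a 1-based row number into one position per
# vector, using the same arithmetic as the closure in lazyExpandGrid.
decodeIndex <- function(i, sizes) {
  indices <- cumprod(c(1L, sizes))
  (i - 1L) %% indices[-1L] %/% indices[-length(indices)] + 1L
}

sizes <- c(a = 4L, b = 5L, c = 6L)               # lengths of 0:3, 0:4, 0:5
grid  <- expand.grid(a = 0:3, b = 0:4, c = 0:5)  # small enough to build fully

# Positions are 1-based; subtract 1 because each vector here starts at 0.
combo7 <- decodeIndex(7L, sizes) - 1L

# Decode every row number and compare with the eager grid.
decoded <- t(sapply(seq_len(nrow(grid)), decodeIndex, sizes = sizes)) - 1L
allMatch <- all(decoded == as.matrix(grid))
```

If allMatch is TRUE, row k of the lazy version corresponds to row k of expand.grid, with the first vector varying fastest.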
What I cannot figure out is how to sample conditionally using lazyExpandGrid. I would like to exclude samples that contain a certain number of particular elements.
For example, say I have these vectors for which I want to create unique combinations: a=0:3, b=0:4, c=0:5. I could create samples using: nxt(sample(50, size=50, replace = F)).
But let's say I am not interested in samples that contain two 0s. How could I exclude these samples? I've tried things like: nxt(sample(which(!(sum(as.data.frame(nxt()) == 0)==2)), size=50, replace = F)).
I just don't understand how to reference the sampled row inside sample() so that I can exclude it if it doesn't meet a certain criterion.
If you want to drop rows that don't meet a condition, I don't think you need to worry about sampling without replacement: passing the same value to nxt generates an identical row, which would still be dropped. It might work, then, to write a wrapper around the function as you've defined it above that simply skips any nxt-generated row that doesn't meet the condition you're after. Here, a row is dropped if it contains exactly two zeroes:
set.seed(0123)
nxt <- lazyExpandGrid(a = 0:3, b = 0:4, c = 0:5)
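nxtDrop <- function(samp, n_row) {
  t(sapply(seq_len(n_row), function(x) {
    y <- nxt(sample(samp, 1))
    # redraw while the row contains exactly two zeroes; comparing values
    # avoids grep(0, y), which would also match numbers such as 10
    while (sum(unlist(y) == 0) == 2) {
      y <- nxt(sample(samp, 1))
    }
    y
  }))
}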
> nxtDrop(120, 10)
a b c
[1,] 2 3 1
[2,] 2 3 4
[3,] 1 2 2
[4,] 1 1 5
[5,] 0 3 5
[6,] 1 1 0
[7,] 3 0 3
[8,] 3 1 5
[9,] 2 1 3
[10,] 2 3 2
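Note that the wrapper above can return the same row more than once. If distinct rows are required, one option is to remember which indices have already been drawn. The following is only a sketch under stated assumptions: sampleDistinct, rowAt, total, keep and maxTries are hypothetical names introduced here, and rowAt is a small stand-in built on expand.grid so the example runs on its own (nxt from lazyExpandGrid could be passed in its place).

```r
# Hypothetical helper: draw up to n distinct rows that satisfy keep(),
# by rejection sampling on row numbers.
#   rowAt: function(index) returning one combination (e.g. nxt)
#   total: size of the full grid (product of the vector lengths)
sampleDistinct <- function(rowAt, total, n, keep, maxTries = 100L * n) {
  seen  <- integer(0)   # row numbers already drawn, accepted or rejected
  rows  <- list()
  tries <- 0L
  while (length(rows) < n && tries < maxTries && length(seen) < total) {
    tries <- tries + 1L
    idx <- sample.int(total, 1L)
    if (idx %in% seen) next          # never evaluate the same row twice
    seen <- c(seen, idx)
    row <- unlist(rowAt(idx))
    if (keep(row)) rows[[length(rows) + 1L]] <- row
  }
  do.call(rbind, rows)               # NULL if nothing was accepted
}

# Stand-in for nxt, so that this block is runnable without lazyExpandGrid.
grid  <- expand.grid(a = 0:3, b = 0:4, c = 0:5)
rowAt <- function(i) as.list(grid[i, ])

set.seed(1)
res <- sampleDistinct(rowAt, nrow(grid), 10,
                      keep = function(r) sum(r == 0) != 2)
```

The maxTries cap is a design choice to keep the loop from running indefinitely when few rows satisfy the condition; in that case fewer than n rows are returned.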