MEMORY1 MEMORY2 MEMORY3 MEMORY4 MEMORY5 MEMORY6 MEMORY7 MEMORY8
1 2.5 2.5 4 5 6 7 8
Suppose I have the above ranking of 8 items. There are 2 ways to break this tie: 1 2 3 4 5 6 7 8
or 1 3 2 4 5 6 7 8
. I am trying to write a function which outputs these two possible sequences when given the original ranking with ties in it.
In the case of
MEMORY1 MEMORY2 MEMORY3 MEMORY4 MEMORY5 MEMORY6 MEMORY7 MEMORY8
4.5 4.5 4.5 4.5 4.5 4.5 4.5 4.5
All items are tied, so there's 8!
possible sequences. permn(8)
or something similar would enumerate the sequences just fine.
In the case of
MEMORY1 MEMORY2 MEMORY3 MEMORY4 MEMORY5 MEMORY6 MEMORY7 MEMORY8
7.5 5 5 2 7.5 2 2 5
There are 3! x 3! x 2! = 72
possible sequences. How can I write a function that outputs these 72 possible sequences given the original ranking?
myfun = function(ranking){
output = vector()
values = sort(unique(ranking))
if(length(values) < 8){
#if there are ties
for(i in 1:length(values)){
value_in_question = values[i]
if(sum(value_in_question %in% values[i] == 1)){
output = output
}else output[i] = permn(values[i])
}
}
return(output)
}
This is my attempt, it doesn't work. And I'm having trouble coming up with a way to enumerate the sequences when there are multiple ties...
EDIT:
dat = c(2.5, 2.5, 2.5, 2.5, 6.5, 6.5, 6.5, 6.5)
names(dat) <- paste0("MEMORY", 1:8)
## Group similar items, compute run lengths, then permute
library(combinat) # permn
gs <- cumsum(abs(c(0, diff(sort(dat)))) > 1e-9)
lens <- rle(gs)$lengths
lst <- mapply(function(a,b) lapply(permn(1:b), `+`, a), c(0, cumsum(head(lens, -1))), lens)
## Expand into data.frame (don't expand if all were the same)
res <- if(!is.null(dim(lst)) && dim(lst)[2] == 1) lst else expand.grid(lst)
Error: cannot allocate vector of size 16.0 Gb
In addition: Warning messages:
1: In rep.int(seq_len(nx), rep.int(rep.fac, nx)) :
Reached total allocation of 8070Mb: see help(memory.size)
2: In rep.int(seq_len(nx), rep.int(rep.fac, nx)) :
Reached total allocation of 8070Mb: see help(memory.size)
3: In rep.int(seq_len(nx), rep.int(rep.fac, nx)) :
Reached total allocation of 8070Mb: see help(memory.size)
4: In rep.int(seq_len(nx), rep.int(rep.fac, nx)) :
Reached total allocation of 8070Mb: see help(memory.size)
5: In rep.int(seq_len(nx), rep.int(rep.fac, nx)) :
Reached total allocation of 8070Mb: see help(memory.size)
6: In rep.int(seq_len(nx), rep.int(rep.fac, nx)) :
Reached total allocation of 8070Mb: see help(memory.size)
Since you are comparing floats, you don't want to use ==
tests. Instead, check that the difference between numbers is sufficiently small. Here is a possible solution that makes no claims on efficiency.
## Example
dat <- c(7.5, 5, 5, 2, 7.5, 2, 2, 5)
names(dat) <- paste0("MEMORY", 1:8)
## Group similar items, compute run lengths, then permute
library(combinat) # permn
gs <- cumsum(abs(c(0, diff(sort(dat)))) > 1e-9)
lens <- rle(gs)$lengths
lst <- mapply(function(a,b) lapply(permn(1:b), `+`, a),
c(0, cumsum(head(lens, -1))), lens, SIMPLIFY = FALSE)
## Expand into data.frame (don't expand if all were the same)
res <- if(!is.null(dim(lst)) && dim(lst)[2] == 1) lst else expand.grid(lst)
## Unnest columns if desired
res <- data.frame(t(apply(res, 1, unlist)))
## Name the columns
names(res) <- names(sort(dat))
head(res)
# MEMORY4 MEMORY6 MEMORY7 MEMORY2 MEMORY3 MEMORY8 MEMORY1 MEMORY5
# 1 1 2 3 4 5 6 7 8
# 2 1 3 2 4 5 6 7 8
# 3 3 1 2 4 5 6 7 8
# 4 3 2 1 4 5 6 7 8
# 5 2 3 1 4 5 6 7 8
# 6 2 1 3 4 5 6 7 8
## Gets all 72 sequences from example: 3!*3!*2!
nrow(res)
# [1] 72
The result should be a data.frame where each row is one of the possible sequences (the sequences being indexes of the sorted data).