r, sequence, seq

Jumping / alternating sequence to reorder the rows of a data frame


Let's say I have a data set (ds) with these rows:

cat
dog
lion
miau
wuff
roarr

I want to reorder them into this sequence:

cat
miau
dog
wuff
lion
roarr

To do that, I need to index the rows with the sequence

1 4 2 5 3 6

Let's take a more general example with arbitrary n:

n <- 10

ds <- data.frame(col=c(paste0(letters[1:n],1),paste0(letters[1:n],2)),stringsAsFactors = F)

ds[,] <- ds[mySeq,]

How can I generate that sequence (mySeq) for any n?

> ds
   col
1   a1
2   b1
3   c1
4   d1
5   e1
6   f1
7   g1
8   h1
9   i1
10  j1
11  a2
12  b2
13  c2
14  d2
15  e2
16  f2
17  g2
18  h2
19  i2
20  j2
> 

Edit: I could imagine zipping the sequences 1:(nrow(ds)/2) and (nrow(ds)/2+1):nrow(ds). But if the number of blocks grows, I would need to zip a lot of sequences by hand, which is not really practical.
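The zipping idea can be written without spelling out one sequence per block. What follows is only a minimal sketch, assuming the rows come in k consecutive blocks of n rows each; the helper name interleave_index and its parameter k are hypothetical and not part of the original post.

```r
# Hypothetical helper: interleave k equally sized blocks of n consecutive row indices.
interleave_index <- function(n, k = 2) {
  # Column j of the matrix holds the row indices of block j.
  blocks <- matrix(seq_len(n * k), nrow = n, ncol = k)
  # Transposing and flattening reads the matrix row by row:
  # 1, n+1, ..., 2, n+2, ...
  as.vector(t(blocks))
}

interleave_index(3)
# [1] 1 4 2 5 3 6
```

With k = 2 this is the same as zipping the first and second half of the row indices; larger k zips more blocks without extra code.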

mixedsort() from gtools won't work with "random" rows:

set.seed(1337)
MHmakeRandomString <- function(n=1, lenght=12)
{
  randomString <- c(1:n)                  # initialize vector
  for (i in 1:n)
  {
    randomString[i] <- paste(sample(c(0:9, letters, LETTERS),
                                    lenght, replace=TRUE),
                             collapse="")
  }
  return(randomString)
}

ds <- data.frame(col=c(paste0(MHmakeRandomString(n),1),paste0(MHmakeRandomString(n),2)),stringsAsFactors = F)

dso <- mixedsort(ds)

I think I do need that sequence!

I updated my first mini example!


Solution

  • Here's another approach, which generates the numerical sequence from its underlying pattern. This means no string operations are needed.

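    sequence_generator <- function(n, nrow) {
      # each base index 1..n is repeated once per block (nrow / n blocks)
      base_seq <- rep(1:n, each = nrow / n)
      # the offsets 0, n, 2n, ... are recycled and jump from block to block
      res <- base_seq + seq(0, (nrow / n) - 1) * n
      res
    }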
    
    sequence_generator(3,6)
    # [1] 1 4 2 5 3 6
    sequence_generator(10,20)
    #[1]  1 11  2 12  3 13  4 14  5 15  6 16  7 17  8 18  9 19 10 20
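
To connect the generated sequence back to the question, here is a sketch of how it might be used to reorder the example data frame. It repeats sequence_generator from above so that it runs on its own; the names my_seq and dso are illustrative, and it assumes nrow(ds) is a multiple of n.

```r
n <- 10
ds <- data.frame(col = c(paste0(letters[1:n], 1), paste0(letters[1:n], 2)),
                 stringsAsFactors = FALSE)

# Copied from the answer above.
sequence_generator <- function(n, nrow) {
  base_seq <- rep(1:n, each = nrow / n)
  base_seq + seq(0, (nrow / n) - 1) * n
}

my_seq <- sequence_generator(n, nrow(ds))

# drop = FALSE keeps the result a data frame even though ds has one column.
dso <- ds[my_seq, , drop = FALSE]
rownames(dso) <- NULL

head(dso$col, 6)
# [1] "a1" "a2" "b1" "b2" "c1" "c2"
```

Indexing into a new object (dso) instead of overwriting ds keeps the original row order available for comparison.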