r sequence rank dense-rank

How to collapse any gaps in a sequence of integers in R?


Is there a handy base R or package function for collapsing, or interpolating across, any gaps in the integer values of a sequence of numbers? I've looked at functions like dplyr::dense_rank, but they don't do the trick in this case. The code below generates an example starting sequence in a single column (Grp):

    > myDF <- data.frame(Grp = c(1, 2.1, 2.2, 4.1, 4.2, 6.1, 9))
    > myDF
      Grp
    1 1.0
    2 2.1
    3 2.2
    4 4.1
    5 4.2
    6 6.1
    7 9.0

Here's the output I would like. I've manually added a column ("Collapse") to the right of each Grp value explaining how that row is derived:

    > myDF
      Grp    Collapse
    1 1.0    Every sequence starts with 1 so leave Grp as is
    2 2.1    Integer gap between rows 1-2 is <= 1 so leave Grp as is
    3 2.2    Integer gap between rows 2-3 is <= 1 so leave Grp as is
    4 3.0    Integer gap between original rows 3-4 is not <= 1 so fill in the gap with the missing integer
    5 4.1    Integer gap between rows 4-5 is <= 1 so leave Grp as is
    6 4.2    Integer gap between rows 5-6 is <= 1 so leave Grp as is
    7 5.0    Integer gap between original rows 5-6 is not <= 1 so fill in the gap with the missing integer
    8 6.1    Integer gap between rows 7-8 is <= 1 so leave Grp as is
    9 7.0    Integer gap between original rows 6-7 is not <= 1 so fill in the gap with the missing integer
   10 8.0    Integer gap between original rows 6-7 is not <= 1 so fill in the gap with the missing integer
   11 9.0    Integer gap between rows 10-11 is <= 1 so leave Grp as is

Solution

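  • You can take the integer part of each value, find the integers missing from the full range, and merge them back in:

    f <- floor(myDF$Grp)              # integer part of each value
    s <- seq(min(f), max(f))          # every integer from the smallest to the largest
    sort(c(myDF$Grp, s[!s %in% f]))   # append the integers not yet present, then sort
    #[1] 1.0 2.1 2.2 3.0 4.1 4.2 5.0 6.1 7.0 8.0 9.0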
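
If the result is needed as a data frame rather than a bare vector, the same idea can be wrapped in a small function. This is only a sketch: the name fill_integer_gaps is hypothetical (not from the answer above), setdiff() is swapped in for the s[!s %in% f] step, and it assumes Grp has no NA values and that ascending order is the order you want.

```r
# Hypothetical helper; the name is illustrative, not part of base R or dplyr.
fill_integer_gaps <- function(x) {
  f <- floor(x)                    # integer part of each value
  s <- seq(min(f), max(f))         # every integer in the covered range
  missing_ints <- setdiff(s, f)    # integers with no matching value in x
  sort(c(x, missing_ints))         # merge and put back in ascending order
}

myDF <- data.frame(Grp = c(1, 2.1, 2.2, 4.1, 4.2, 6.1, 9))

# A new data frame is required because the row count changes (7 rows -> 11 rows).
newDF <- data.frame(Grp = fill_integer_gaps(myDF$Grp))
newDF$Grp
#[1] 1.0 2.1 2.2 3.0 4.1 4.2 5.0 6.1 7.0 8.0 9.0
```

Note that sort() reorders the whole vector, so if the original column were not already ascending, its original order would be lost.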