r

Split rows featuring a range of numbers


I have a data frame with a column A and columns B1, B2, … Some entries in column A are number ranges of the form "3-5". This is only an illustration: my real data has several rows of the form "integer-another integer". I would like to expand each such row into one row per integer in the range, that is, replace it with rows whose entries in column A are "3", "4", and "5", with the entries in columns B1, B2, … replicated.

Input:

df <- data.frame(
  A = c("1", "2", "3-5", "6"),
  B1 = c("a", "b", "c", "d")
)

Desired output:

    A B1
1   1  a
2   2  b
3   3  c
4   4  c
5   5  c
6   6  d

This obviously calls for separate_longer_delim.

df %>% 
  separate_longer_delim(A, "-")

is a step in the right direction:

  A B1
1 1  a
2 2  b
3 3  c
4 5  c
5 6  d

But it is unclear to me how to fill in the missing rows (here, the row with 4 in column A). It seems like a job for reframe, but I do not know where to start.


Solution

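  • You could use seq and group on the other columns. Convert A to integer first so that min and max compare numbers rather than strings (as text, "11" sorts before "9").

    library(dplyr)  # reframe, mutate, %>%
    library(tidyr)  # separate_longer_delim

    df <- data.frame(
      A = c("1", "2", "3-5", "6"),
      B1 = c("a", "b", "c", "d"),
      B2 = c("e", "f", "g", "h")
    )

    df %>% 
       separate_longer_delim(A, "-") %>%
       mutate(A = as.integer(A)) %>%
       reframe(A = seq(min(A), max(A)), .by = -A)
    
      B1 B2 A
    1  a  e 1
    2  b  f 2
    3  c  g 3
    4  c  g 4
    5  c  g 5
    6  d  h 6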
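
One caveat with grouping on the other columns: .by = -A treats all rows that share the same B1, B2, … values as a single group, so two separate entries in A with identical B values would be merged into one range. The following is a minimal sketch of a variant that groups per original row instead. The row_id column and the extra "9-11" row are assumptions added for illustration and are not part of the original question.

```r
library(dplyr)
library(tidyr)

# Example data: the rows "6" and "9-11" deliberately share B1 = "d"
df <- data.frame(
  A  = c("1", "2", "3-5", "6", "9-11"),
  B1 = c("a", "b", "c", "d", "d"),
  stringsAsFactors = FALSE
)

expanded <- df %>%
  mutate(row_id = row_number()) %>%            # hypothetical helper: one id per original row
  separate_longer_delim(A, "-") %>%            # "3-5" becomes two rows, "3" and "5"
  mutate(A = as.integer(A)) %>%                # compare numerically, not as text
  reframe(A = seq(min(A), max(A)), .by = -A) %>%  # groups on row_id and B1
  select(-row_id) %>%                          # drop the helper column
  relocate(A)                                  # put A back in first position

expanded
```

Because row_id is part of the grouping, the "6" row and the "9-11" row stay separate even though they share B1 = "d"; grouping on B1 alone would be expected to fill in 7 and 8 as well.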