Tags: r, range, data-cleaning

Splitting a range of values in one cell into multiple observations using R


I am working with slightly messy data in which a single cell holds a range of values, as shown below:

Code           Flag
69660-69663      1
69666-69667      2

The desired output is:

Code   Flag
69660    1
69661    1
69662    1
69663    1
69666    2
69667    2

Is there a package that can handle a range of values and expand it into separate observations?

I tried this solution:

library(splitstackshape)
mydb2 <- cSplit(mydb, "Code", sep = "-", direction = "long")

This only splits each value into two observations (the start and the end of the range) instead of expanding it into every value in the range.


Solution

  • Here's one possibility:

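    # Expand one "start-end" string into one row per integer in the range
    f <- function(x, y) {
        s <- as.integer(strsplit(as.character(x), "-")[[1]])
        data.frame(Code = s[1]:s[2], Flag = y)
    }
    
    # Apply f to each (Code, Flag) pair, then stack the pieces into one data frame
    do.call(rbind, Map(f, df$Code, df$Flag))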
    #    Code Flag
    # 1 69660    1
    # 2 69661    1
    # 3 69662    1
    # 4 69663    1
    # 5 69666    2
    # 6 69667    2
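
    For larger inputs, a variant that avoids building one small data frame per row may be worth trying. The sketch below is base R only and assumes the input looks like the sample in the question; the names `df`, `bounds`, `codes`, and `out` are illustrative, not from the original post. It also assumes that a cell without a hyphen should be kept as a single observation.

```r
# Sample input, reconstructed from the question (assumed to be character/numeric)
df <- data.frame(
  Code = c("69660-69663", "69666-69667"),
  Flag = c(1, 2),
  stringsAsFactors = FALSE
)

# Split each cell on the literal hyphen and convert the pieces to integers
bounds <- lapply(strsplit(df$Code, "-", fixed = TRUE), as.integer)

# Build the full sequence for each range; a single value yields itself
codes <- lapply(bounds, function(b) seq(b[1], b[length(b)]))

# Repeat each Flag once per expanded code and assemble the result in one step
out <- data.frame(
  Code = unlist(codes),
  Flag = rep(df$Flag, lengths(codes))
)

print(out)
#    Code Flag
# 1 69660    1
# 2 69661    1
# 3 69662    1
# 4 69663    1
# 5 69666    2
# 6 69667    2
```

    Because the result is assembled with a single `data.frame()` call rather than repeated `rbind()`, the row names come out as plain 1 to n regardless of whether `Code` is stored as a factor or as character.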