
Transform dataframe by repeating rows and create a variable counting values of two variables


Here is a small subset of my data:

df 

ID numberPOS numberNEG
 1         2         3
 2         5         4
 3         1         2

I would like to transform the data frame by repeating the rows for each ID and adding a new variable, statut, that marks each repeated row as positive or negative, like this:

df
ID numberPOS numberNEG statut
1          2         3    POS
1          2         3    POS
1          2         3    NEG
1          2         3    NEG
1          2         3    NEG
2          5         4    POS
2          5         4    POS
2          5         4    POS
2          5         4    POS
2          5         4    POS
2          5         4    NEG
2          5         4    NEG
2          5         4    NEG
2          5         4    NEG
3          1         2    POS
3          1         2    NEG
3          1         2    NEG

So the first row is repeated 5 times because numberPOS + numberNEG = 2 + 3 = 5, and statut should be POS on 2 of those rows and NEG on the other 3. Does anyone see how to do this? Any help would be greatly appreciated. Thank you.


Solution

  • Using only the base package, a solution could be this:

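    df <- data.frame(ID=c(1,2,3),numberPOS=c(2,5,1),numberNEG=c(3,4,2))
    
    # Build one expanded data frame per ID, then stack them with rbind
    do.call("rbind",lapply(df$ID, function(id) {
      # Select the single row belonging to this ID
      fittingRowIndex <- df$ID==id
      fittingRow <- df[fittingRowIndex,]
      # Repeat that row numberPOS + numberNEG times
      newDf <- fittingRow[rep(1,fittingRow$numberPOS+fittingRow$numberNEG),]
      # Label the first numberPOS copies "POS" and the remaining ones "NEG"
      newDf$statut <- rep(c("POS","NEG"),times=c(fittingRow$numberPOS,fittingRow$numberNEG))
      newDf
    }))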
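
    The same result can probably be obtained without looping over the IDs. The following is only a sketch of a vectorized base R alternative, not part of the original answer; the helper names n_rep, counts and out are made up for illustration, and it assumes the counts are non-negative whole numbers.

```r
df <- data.frame(ID = c(1, 2, 3), numberPOS = c(2, 5, 1), numberNEG = c(3, 4, 2))

# Number of copies needed for each original row (assumed helper name)
n_rep <- df$numberPOS + df$numberNEG

# Repeat each row index n_rep times to expand the data frame
out <- df[rep(seq_len(nrow(df)), times = n_rep), ]

# Interleave the counts row by row: POS of row 1, NEG of row 1, POS of row 2, ...
counts <- as.vector(rbind(df$numberPOS, df$numberNEG))

# Repeat the alternating labels according to the interleaved counts
out$statut <- rep(rep(c("POS", "NEG"), times = nrow(df)), times = counts)

# Drop the duplicated row names produced by the indexing step
rownames(out) <- NULL
out
```

    Unlike the lapply version, this sketch works on row positions rather than ID values, so it should also cope with IDs that appear more than once.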