Restructuring a data.frame into multiple rows based on strsplit


I have data structured like this.

    structure(list(id = c("4031", "1040;2040;3040", "4040", 
    "1050;2050;3050"), description = c("Sentence A", 
    "Sentence B", "Sentence C", 
    "Sentence D")), row.names = 1:4, class = "data.frame")

                  id description
    1           4031  Sentence A
    2 1040;2040;3040  Sentence B
    3           4040  Sentence C
    4 1050;2050;3050  Sentence D

I would like to restructure the data so that each id containing ";" is split into separate rows, with the description repeated on each new row. The desired result:

    structure(list(id = c("4031", "1040","2040","3040", "4040", 
    "1050","2050","3050"), description = c("Sentence A", 
    "Sentence B","Sentence B","Sentence B", "Sentence C", 
    "Sentence D","Sentence D","Sentence D")), row.names = 1:8, class = "data.frame")

        id description
    1 4031  Sentence A
    2 1040  Sentence B
    3 2040  Sentence B
    4 3040  Sentence B
    5 4040  Sentence C
    6 1050  Sentence D
    7 2050  Sentence D
    8 3050  Sentence D

I know I can split the id column with strsplit, but I can't work out an efficient way to convert the result to rows without a loop:

    strsplit(as.character(a$id), ";")

Solution

  • Using base R, with the data frame stored in df:

    > IDs <- strsplit(df$id, ";")
    > data.frame(ID=unlist(IDs), Description=rep(df$description, lengths(IDs)))
        ID Description
    1 4031  Sentence A
    2 1040  Sentence B
    3 2040  Sentence B
    4 3040  Sentence B
    5 4040  Sentence C
    6 1050  Sentence D
    7 2050  Sentence D
    8 3050  Sentence D
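
The approach above works because strsplit returns one character vector per row, and lengths(IDs) says how many times each description has to be repeated to line up with unlist(IDs). As a rough sketch of the same idea for a data frame with any number of other columns, the helper below repeats whole rows instead of a single column. The name split_rows and its arguments are made up for illustration and are not part of the original answer.

```r
# Hypothetical helper (not from the original answer): split one
# delimited column into rows, repeating all other columns to match.
split_rows <- function(df, col, sep = ";") {
  # One character vector of pieces per input row
  parts <- strsplit(as.character(df[[col]]), sep, fixed = TRUE)
  # Repeat each row index once per piece found in that row
  idx <- rep(seq_len(nrow(df)), lengths(parts))
  out <- df[idx, , drop = FALSE]
  # Overwrite the delimited column with the individual pieces
  out[[col]] <- unlist(parts)
  # Reset row names so they run 1..n
  rownames(out) <- NULL
  out
}

# Same data as in the question
df <- data.frame(
  id = c("4031", "1040;2040;3040", "4040", "1050;2050;3050"),
  description = c("Sentence A", "Sentence B", "Sentence C", "Sentence D"),
  stringsAsFactors = FALSE
)

result <- split_rows(df, "id")
print(result)
```

This keeps the original column names (id, description) rather than renaming them. One caveat of the sketch: a row whose value is an empty string yields zero pieces from strsplit and would therefore be dropped from the result.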