
R: List with multiple headings - How to split by heading (unequal lines per heading)


I've a large file that looks like this:

Heading1
1 ABC
2 DEF
Heading2
1 GHI
2 JKL
3 MNO
Heading3
1 PQR
2 STU

The headings always follow the same pattern, but the entries below each heading vary: different numbers of entries, no common pattern, and different numbers of letters and/or words.

I want to split the one list into multiple lists, i.e. a new list for each heading. How can I solve this?


Solution

  • This is what I would do:

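    # test is assumed to be a character vector with one element per line of the file
    # TRUE for every line that is a heading
    header_positions <- grepl("^Heading", test)
    header_positions
    
    # running count of headings seen so far, i.e. the group each line belongs to
    grouping_index <- cumsum(header_positions)
    grouping_index
    
    # drop the heading lines themselves and split the remaining lines by group
    li <- split(test[!header_positions], grouping_index[!header_positions])
    li
    
    # optional: name each list element after its heading
    li <- setNames(li, test[header_positions])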
    

    I think the cumsum(grepl(...)) pattern is very useful for this kind of list-splitting task.
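
    To make the pattern concrete, here is a minimal, self-contained sketch that runs it on the sample lines from the question. The vector test is typed in by hand here, and the names is_header and group are illustrative; in practice test would presumably come from reading the file.

    ```r
    # Sample input from the question, one element per line of the file
    test <- c("Heading1", "1 ABC", "2 DEF",
              "Heading2", "1 GHI", "2 JKL", "3 MNO",
              "Heading3", "1 PQR", "2 STU")

    is_header <- grepl("^Heading", test)  # TRUE at positions 1, 4 and 8
    group     <- cumsum(is_header)        # 1 1 1 2 2 2 2 3 3 3

    # Split the non-heading lines by group, then name each group after its heading
    li <- split(test[!is_header], group[!is_header])
    names(li) <- test[is_header]

    lengths(li)  # Heading1: 2, Heading2: 3, Heading3: 2
    ```

    Note that the naming step assumes every heading has at least one entry below it. A heading with no entries produces no group, so there would be more headings than list elements and the name assignment would fail.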

    If you want to write each group to its own file with writeLines(), pass it one list element at a time. The elements produced by split() are already character vectors, so unlist() is only a safeguard here:

    for(n in names(li)) {
      writeLines(unlist(li[[n]]), paste0(n, ".txt"))
    } 
    

    Iterating over the names of a list is another helpful pattern: each name can be used directly (for the filename) and to index the list (for the file contents).
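
    As a rough illustration of iterating over names, the sketch below writes a small named list to files and reads one back. The list is typed in by hand from the question's sample, and writing to tempdir() via the hypothetical out_dir variable is an assumption made so the example leaves nothing behind in the working directory.

    ```r
    # Small named list standing in for the split result (values from the question)
    li <- list(Heading1 = c("1 ABC", "2 DEF"),
               Heading2 = c("1 GHI", "2 JKL", "3 MNO"))

    out_dir <- tempdir()  # assumption: a temporary directory as the output location

    for (n in names(li)) {
      # n is used twice: to build the file name and to index the list
      writeLines(li[[n]], file.path(out_dir, paste0(n, ".txt")))
    }

    # Read one file back to check its contents
    readLines(file.path(out_dir, "Heading2.txt"))
    # [1] "1 GHI" "2 JKL" "3 MNO"
    ```

    Looping over seq_along(li) would work as well, but then the name has to be looked up separately with names(li)[i].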