I have some data that looks like this:
dat <- c("Sales","Jim","Halpert","","",
"Reception","Pam","Beasley","","",
"Not.Manager","Dwight","Schrute","Bears","Beets","BattlestarGalactica","","",
"Manager","Michael","Scott","","")
Each "chunk" of data is consecutive, with some blanks in between. I want to transform the data into a list with one character vector per chunk, like this:
iwant <- list(
c("Sales","Jim","Halpert"),
c("Reception","Pam","Beasley"),
c("Not.Manager","Dwight","Schrute","Bears","Beets","BattlestarGalactica"),
c("Manager","Michael","Scott")
)
Suggestions? I am using rvest and stringi. I do not want to add more packages.
You can use rle and split with lapply:
lapply(
  split(dat, with(rle(dat != ''), rep(cumsum(values), lengths))),
  function(x) x[x != '']
)
#$`1`
#[1] "Sales" "Jim" "Halpert"
#$`2`
#[1] "Reception" "Pam" "Beasley"
#$`3`
#[1] "Not.Manager" "Dwight" "Schrute" "Bears" "Beets"
#[6] "BattlestarGalactica"
#$`4`
#[1] "Manager" "Michael" "Scott"
The rle part creates the groups to split on:
with(rle(dat != ''), rep(cumsum(values), lengths))
#[1] 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3 3 3 3 4 4 4 4 4
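To see where those group numbers come from, it may help to look at the pieces of the rle result one at a time. This is only a sketch that reuses dat from the question; the names r and grp are made up for illustration.

```r
dat <- c("Sales","Jim","Halpert","","",
         "Reception","Pam","Beasley","","",
         "Not.Manager","Dwight","Schrute","Bears","Beets","BattlestarGalactica","","",
         "Manager","Michael","Scott","","")

# Run-length encode the "is this element non-empty?" vector
r <- rle(dat != "")
r$values   # TRUE = run of non-empty strings, FALSE = run of blanks
#[1]  TRUE FALSE  TRUE FALSE  TRUE FALSE  TRUE FALSE
r$lengths  # how long each run is
#[1] 3 2 3 2 6 2 3 2

# cumsum() counts TRUE as 1 and FALSE as 0, so the counter only
# goes up at the start of each non-empty run
cumsum(r$values)
#[1] 1 1 2 2 3 3 4 4

# Expand the per-run counter back to one group id per element of dat
grp <- rep(cumsum(r$values), r$lengths)
```

Each run of blanks inherits the id of the chunk before it, which is why the blanks still have to be filtered out after splitting.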
After split, we use lapply to remove the empty strings from each group.
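One possible variant, still base R only: drop the blanks before splitting rather than afterwards. This is a sketch, not a replacement for the answer above; keep, grp and res are illustrative names. It assumes the same dat, and it also avoids an empty leading group if the vector happens to start with blanks (those elements get group id 0 and are discarded along with the other blanks).

```r
dat <- c("Sales","Jim","Halpert","","",
         "Reception","Pam","Beasley","","",
         "Not.Manager","Dwight","Schrute","Bears","Beets","BattlestarGalactica","","",
         "Manager","Michael","Scott","","")

keep <- dat != ""                                     # TRUE for real entries
grp  <- with(rle(keep), rep(cumsum(values), lengths)) # same group ids as above

# Split only the non-empty entries; unname() drops the "1", "2", ... names
res <- unname(split(dat[keep], grp[keep]))
```

Here res is a plain list of character vectors, so res[[3]] returns the six Dwight entries.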