Given a vector like this:
v1 <- c("node 1",
"primary",
"sports, improve",
"music, improve",
"painting, improve",
"surrogate",
"music",
"node 2",
"primary",
"music, improve",
"painting, improve",
"node 3",
"primary",
"sports, improve")
I want to combine each name listed under "primary" with its node into a single string. For example, the first node ("node 1", the first element of the vector above) should yield three strings: "node 1 sports", "node 1 music", "node 1 painting". "node 2" should yield two: "node 2 music", "node 2 painting". The real data is much bigger than this vector, so indexing by hand and building the strings manually is not an option. My initial thought was to find each element that contains "improve" with grepl, but I can't find a way to assign the elements it finds to their corresponding node.
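Create a grouping variable from the occurrences of 'node': grepl returns a logical vector that is TRUE at each node element, and its cumsum gives a group id for every element. Then split the vector 'v1' into a list by that id, paste the first element of each group (the node name) with the part before the comma of the elements that contain 'improve', and stack the result into a two-column data.frame.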
stack(lapply(split(v1, cumsum(grepl('node', v1))),
function(x) paste(x[1], sub(",.*", "", x[grep('improve', x)]))))[2:1]
-output
#  ind          values
#1   1   node 1 sports
#2   1    node 1 music
#3   1 node 1 painting
#4   2    node 2 music
#5   2 node 2 painting
#6   3   node 3 sports
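
The one-liner above can be unpacked into separate steps, which may make it easier to adapt to the real data. This is only a sketch under a few assumptions: the vector is stored as `v1`, node elements start with "node", and the intermediate names (`is_node`, `grp`, `pieces`, `res`, `out`) are made up for illustration. The check for empty matches is an addition, not part of the original answer: `paste()` treats a zero-length argument as "", so a node with no 'improve' entries would otherwise yield a stray string such as "node 4 ".

```r
# Sample data from the question, assumed to be stored as `v1`
v1 <- c("node 1", "primary", "sports, improve", "music, improve",
        "painting, improve", "surrogate", "music",
        "node 2", "primary", "music, improve", "painting, improve",
        "node 3", "primary", "sports, improve")

# 1. TRUE at every element that starts a new node
is_node <- grepl("^node", v1)

# 2. Running count of node headers = group id for each element
grp <- cumsum(is_node)

# 3. One character vector per node; the node name is its first element
pieces <- split(v1, grp)

# 4. Per node: keep the 'improve' entries, drop everything from the
#    comma onwards, and prepend the node name
res <- lapply(pieces, function(x) {
  hits <- grep("improve", x, value = TRUE)
  if (length(hits) == 0) return(character(0))  # node without matches
  paste(x[1], sub(",.*", "", hits))
})

# 5. Flatten to a plain character vector (use stack(res) instead
#    to get the two-column data.frame)
out <- unlist(res, use.names = FALSE)
out
```

If only the strings are needed, `out` is enough; `stack(res)[2:1]` gives the same two-column layout as the output shown above.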