This is what my dataframe looks like. The rightmost column("FullCycle") is my desired column. For a given name and at a given point in time, I want to see the whole cycle of level changes for a person.
library(data.table)
dt <- fread('
Name Level Date RecentLevelChange FullCycle
John 1 2016-01-01 NA 1
John 1 2016-01-10 NA 1
John 2 2016-01-17 1->2 1->2
John 2 2016-01-18 NA 1->2
John 3 2016-01-19 2->3 1->2->3
John 4 2016-01-20 3->4 1->2->3->4
John 4 2016-01-21 NA 1->2->3->4
John 7 2016-01-22 4->7 1->2->3->4->7
Tom 1 2016-01-10 NA 1
Tom 2 2016-01-17 1->2 1->2
Tom 2 2016-01-18 NA 1->2
Tom 3 2016-01-19 2->3 1->2->3
Tom 4 2016-01-20 3->4 1->2->3->4
Tom 4 2016-01-21 NA 1->2->3->4
Tom 7 2016-01-22 4->7 1->2->3->4->7
')
I have created the field "RecentLevelChange" by trying
require(dplyr)
dt[,RecentLevelChange :=
as.character(ifelse(lag(Level)==Level ,NA,
paste(lag(Level),Level,sep="->"))),by=Name]
But I dont know how to create the "FullCycle' Column. I sincerely appreciate your help.
After grouping by 'Name', we loop through the sequence of rows (seq_len(.N)
) and paste
the unique
"Level" from first to the corresponding row.
dt[,FullCycle := vapply(seq_len(.N), function(i)
paste(unique(Level[1:i]), collapse="->"), character(1)) , by = Name]
dt
# Name Level Date RecentLevelChange FullCycle
# 1: John 1 2016-01-01 NA 1
# 2: John 1 2016-01-10 NA 1
# 3: John 2 2016-01-17 1->2 1->2
# 4: John 2 2016-01-18 NA 1->2
# 5: John 3 2016-01-19 2->3 1->2->3
# 6: John 4 2016-01-20 3->4 1->2->3->4
# 7: John 4 2016-01-21 NA 1->2->3->4
# 8: John 7 2016-01-22 4->7 1->2->3->4->7
# 9: Tom 1 2016-01-10 NA 1
#10: Tom 2 2016-01-17 1->2 1->2
#11: Tom 2 2016-01-18 NA 1->2
#12: Tom 3 2016-01-19 2->3 1->2->3
#13: Tom 4 2016-01-20 3->4 1->2->3->4
#14: Tom 4 2016-01-21 NA 1->2->3->4
#15: Tom 7 2016-01-22 4->7 1->2->3->4->7