This loop works for a small amount of data, but on a large volume of data the looping takes quite a long time. Is there an alternative way to do this in R that would speed up the processing time?
#set correction to the transaction
mins<-45
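for (i in seq_len(nrow(tnx))) {
  # the last row has no successor, so it always falls through to the else branch
  if (i < nrow(tnx) && tnx$id[i] == tnx$id[i + 1]) {
    # a gap of 45 mins or more means a new trip starts at the next row
    if (tnx$diff[i] >= mins) {
      tnx$FIRST[i + 1] <- TRUE
      tnx$LAST[i] <- TRUE
    }
  } else {
    tnx$LAST[i] <- TRUE
  }
}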
Thanks in advance.
EDIT
What I am trying to do is set the TRUE/FALSE values in the FIRST and LAST columns by checking the diff column.
Data like:
tnx <- data.frame(
  id = rep(c("A", "C", "D", "E"), 4:1),
  FIRST = c(T, T, F, F, T, F, F, T, F, T),
  LAST = c(T, F, F, T, F, F, T, F, T, T),
  diff = c(270, 15, 20, -1, 5, 20, -1, 15, -1, -1)
)
EDIT PORTION FOR @thelatemail
# id diff FIRST LAST
#1 A 270 TRUE TRUE
#2 A 15 TRUE FALSE
#3 A 20 FALSE FALSE
#4 A -1 FALSE TRUE
#5 C 5 TRUE FALSE
#6 C 20 FALSE FALSE
#7 C -1 FALSE TRUE
#8 D 15 TRUE FALSE
#9 D -1 FALSE TRUE
#10 E -1 TRUE TRUE
Something like this should work:
I reset the FIRST and LAST values to make it obvious in this example:
tnx$FIRST <- FALSE
tnx$LAST <- FALSE
The next two parts use ?ave to set tnx$FIRST to TRUE for the first row in each id group, and tnx$LAST to TRUE for the last row in each id group, respectively.
tnx$FIRST <- as.logical(
  with(tnx, ave(diff, id, FUN = function(x) seq_along(x) == 1)))
tnx$LAST <- as.logical(
  with(tnx, ave(diff, id, FUN = function(x) seq_along(x) == length(x))))
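As a possible alternative to the ave() calls, the same group boundaries can be flagged with base R's duplicated(). This is only a sketch and not part of the approach above; the id vector below is a made-up stand-in for tnx$id. Like ave(), it treats all rows sharing an id as one group, whether or not they are adjacent.

```r
# Hypothetical example ids, mirroring the layout of tnx$id
id <- rep(c("A", "C", "D", "E"), 4:1)

# First row of each id: the id has not been seen before
first <- !duplicated(id)

# Last row of each id: the id is not seen again when scanning from the end
last <- !duplicated(id, fromLast = TRUE)

print(data.frame(id, first, last))
```

Because duplicated() already returns a logical vector, no as.logical() conversion is needed.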
The final two parts then:
- set tnx$LAST to TRUE when tnx$diff is >=45.
- set tnx$FIRST to TRUE when the previous value for tnx$diff is >=45.
tnx$LAST[tnx$diff >= 45] <- TRUE
tnx$FIRST[c(NA,head(tnx$diff,-1)) >= 45] <- TRUE
# id diff FIRST LAST
#1 A 270 TRUE TRUE
#2 A 15 TRUE FALSE
#3 A 20 FALSE FALSE
#4 A -1 FALSE TRUE
#5 C 5 TRUE FALSE
#6 C 20 FALSE FALSE
#7 C -1 FALSE TRUE
#8 D 15 TRUE FALSE
#9 D -1 FALSE TRUE
#10 E -1 TRUE TRUE
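For reuse on larger data, the vectorised steps could be collected into one function. The following is a minimal sketch under some assumptions: flag_trips is a hypothetical name, the threshold is passed as mins, diff contains no NA values, and FIRST and LAST are recomputed from scratch rather than updated in place. It uses duplicated() for the group boundaries instead of ave(), which is a substitution on my part.

```r
# Hypothetical helper wrapping the vectorised steps.
# Assumes tnx has columns id and diff, and that diff has no NA values.
flag_trips <- function(tnx, mins = 45) {
  n <- nrow(tnx)

  # Group boundaries: first and last row of each id
  first_in_id <- !duplicated(tnx$id)
  last_in_id <- !duplicated(tnx$id, fromLast = TRUE)

  # Rows whose gap to the next transaction is at least `mins`
  long_gap <- tnx$diff >= mins

  # Same flag shifted down one row: TRUE where the previous row had a long gap
  prev_long_gap <- c(FALSE, long_gap)[seq_len(n)]

  tnx$FIRST <- first_in_id | prev_long_gap
  tnx$LAST <- last_in_id | long_gap
  tnx
}

# Sample data from the question, without the pre-filled FIRST/LAST columns
tnx <- data.frame(
  id = rep(c("A", "C", "D", "E"), 4:1),
  diff = c(270, 15, 20, -1, 5, 20, -1, 15, -1, -1)
)

res <- flag_trips(tnx)
print(res)
```

On the sample data this gives the same FIRST and LAST columns as the table above. Padding the shifted vector with FALSE instead of NA avoids NA values in the logical index.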