Consider this artificial data frame:
IDtest <- c(1, 1, 1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3)
Class <- c(1, 1, 3, 4, 4, 5, 1, 1, 2, 2, 2, 3, 4)
Day <- c(0, 47, 76, 100, 150, 173, 0, 47, 76, 0, 47, 76, 100)
Area <- c(0.45, 0.85, 1.50, 1.53, 1.98, 5.2,
          0.36, 0.58, 1.2,
          0.85, 1.36, 2.26, 3.59)
df <- data.frame(IDtest, Class, Day, Area)
df
   IDtest Class Day Area
1       1     1   0 0.45
2       1     1  47 0.85
3       1     3  76 1.50
4       1     4 100 1.53
5       1     4 150 1.98
6       1     5 173 5.20
7       2     1   0 0.36
8       2     1  47 0.58
9       2     2  76 1.20
10      3     2   0 0.85
11      3     2  47 1.36
12      3     3  76 2.26
13      3     4 100 3.59
I'd like to compute, for each IDtest, the last Day in each Class minus the last Day in the previous Class:
1) For IDtest 1 in Class 1: step1 = 47 - 0
2) For IDtest 1 in Class 3: step1 = 76 - 47
3) For IDtest 1 in Class 4: step1 = 150 - 76
4) For IDtest 1 in Class 5: step1 = 173 - 150
and so on, up to IDtest 3.
For this I tried:
df$step1 <- NA
for (i in 1:max(df$Class)) {
  if (i == 1) {
    df$step1[df$Class == i] <- max(df$Day[df$Class == i]) - 0 # quite silly
  } else {
    # "last" means the last Day of the previous Class, not within the same Class
    df$step1[df$Class == i] <- max(df$Day[df$Class == i]) - max(df$Day[df$Class == i - 1])
  }
}
This works when the Class values are consecutive, but within IDtest 1 the Class jumps from 1 to 3. In that case my code gives -Inf values, because it should use the last existing Class (1) rather than Class 2, which doesn't exist.
My desired output is:
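To make the failure concrete, here is a minimal sketch using only the rows of IDtest 1. The names `days`, `cls`, `missing_max`, `prev` and `step` are made up for this illustration and are not part of my real code.

```r
# Day and Class values for IDtest 1 only (copied from the data frame above)
days <- c(0, 47, 76, 100, 150, 173)
cls  <- c(1, 1, 3, 4, 4, 5)

# Class 2 never occurs for this ID, so the subset is empty;
# max() of an empty vector returns -Inf (and emits a warning)
missing_max <- suppressWarnings(max(days[cls == 2]))
print(missing_max)

# What I actually want for Class 3: look up the previous Class that exists
prev <- max(cls[cls < 3])                             # previous existing Class
step <- max(days[cls == 3]) - max(days[cls == prev])  # 76 - 47
print(step)
```

So the subtraction itself is fine; the problem is only in how the "previous" Class is looked up with `i - 1`.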
new.df
  IDtest Class Day Area step1
1      1     1   0 0.45    47
2      1     1  47 0.85    47
3      1     3  76 1.50    29
4      1     4 100 1.53    74
5      1     4 150 1.98    74
6      1     5 173 5.20    23
Do you see any simple modification that would achieve this?
I am not sure if this is what you are after, but you could try:
merge(df,
  within(
    aggregate(Day ~ IDtest + Class, df, max),
    step1 <- ave(Day, IDtest, FUN = function(x) diff(c(0, x)))
  ),
  by = c("IDtest", "Class"),
  all = TRUE
)
which gives
   IDtest Class Day.x Area Day.y step1
1       1     1     0 0.45    47    47
2       1     1    47 0.85    47    47
3       1     3    76 1.50    76    29
4       1     4   100 1.53   150    74
5       1     4   150 1.98   150    74
6       1     5   173 5.20   173    23
7       2     1     0 0.36    47    47
8       2     1    47 0.58    47    47
9       2     2    76 1.20    76    29
10      3     2     0 0.85    47    47
11      3     2    47 1.36    47    47
12      3     3    76 2.26    76    29
13      3     4   100 3.59   100    24
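If you want exactly the column layout from the question (a single Day column, no Day.x / Day.y), one possible variant is sketched below. It assumes that "previous Class" means the previous Class that actually occurs for the same IDtest, and that Day increases with Class within each IDtest. `agg` is just a helper name introduced here.

```r
# Rebuild the example data so the snippet runs on its own
IDtest <- c(1, 1, 1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3)
Class  <- c(1, 1, 3, 4, 4, 5, 1, 1, 2, 2, 2, 3, 4)
Day    <- c(0, 47, 76, 100, 150, 173, 0, 47, 76, 0, 47, 76, 100)
Area   <- c(0.45, 0.85, 1.50, 1.53, 1.98, 5.2,
            0.36, 0.58, 1.2,
            0.85, 1.36, 2.26, 3.59)
df <- data.frame(IDtest, Class, Day, Area)

# Last Day of every IDtest/Class combination that actually exists
agg <- aggregate(Day ~ IDtest + Class, df, max)
agg <- agg[order(agg$IDtest, agg$Class), ]

# Within each IDtest, difference to the previous existing Class (0 for the first)
agg$step1 <- ave(agg$Day, agg$IDtest, FUN = function(x) diff(c(0, x)))

# Attach only step1, so merge() has no duplicated Day column to rename
new.df <- merge(df, agg[c("IDtest", "Class", "step1")], by = c("IDtest", "Class"))
new.df <- new.df[order(new.df$IDtest, new.df$Day), ]
rownames(new.df) <- NULL
new.df
```

Because only existing IDtest/Class pairs are aggregated, a gap such as Class 2 missing for IDtest 1 never produces an empty subset, so no -Inf appears.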