I believe ddply is the tool I need for my task, but I'm having difficulty getting the correct results. I've read about ddply for a number of hours and experimented with different code, but I haven't gotten any further on my own. Here is an example data frame:
station <- c(rep("muc",13), rep("nbw", 17))
year <- c(rep(1994,4),rep(1995,4),rep(1996,5),rep(1994,5), rep(1995,4), rep(1996,4), rep(1997, 4))
depth <- c(rep(c("HUM","31-60","61-90","91-220"),2), rep(c("HUM","0-30", "31-60","61-90","91-220"),2),rep(c("HUM","0-30", "31-60","91-220"),1),rep(c("HUM","0-30", "31-60","61-90"),2))
doc <- c(80, 10, 3, 2,70, 15, 5, 5,70, 20, 5, 5, 2, 40, 10, 3, 2, 1,50, 15, 5, 2, 45, 20, 2, 1,35, 8, 2, 1)
df <- data.frame(station, year, depth, doc)
df
depth refers to soil depth (HUM = humus layer) and doc is the measured dissolved organic carbon (DOC) for that soil depth. Note that not every year has doc measurements for every depth class, so some depth classes are missing. This is annoying but comes up often in my data set. With ddply I would like to add a column to this data frame so that, for each depth, the doc of the soil layer lying directly above is returned; for HUM, NA should be given, since nothing is on top of the humus layer. As an example:
depth doc doc_m1
HUM 80 NA
31-60 10 80
61-90 3 10
91-220 2 3
In the data frame, this should of course be calculated for every year and every depth. I'd like to avoid which and for loops, and ddply seems suited for this; however, I haven't had any luck getting a lag command to work with ddply. This is as far as I got with the code (obviously not very far):
doc <- ddply(df, .(year), transform,
doc_m1 = ????)
Does anyone have a suggestion? Thanks in advance!
If your depths are already in the right order in your data set (as they are in your example), you could just do:
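library(plyr)  # provides ddply()
# Within each station/year group, shift doc down by one row and pad the top with NA
doc2 <- ddply(df, .(station, year), transform,
              doc_m1 = c(NA, doc[-length(doc)]))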
Note I also grouped on station. This gives:
> head(doc2, 10)
station year depth doc doc_m1
1 muc 1994 HUM 80 NA
2 muc 1994 31-60 10 80
3 muc 1994 61-90 3 10
4 muc 1994 91-220 2 3
5 muc 1995 HUM 70 NA
6 muc 1995 31-60 15 70
7 muc 1995 61-90 5 15
8 muc 1995 91-220 5 5
9 muc 1996 HUM 70 NA
10 muc 1996 0-30 20 70
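The lag itself is plain vector indexing, so it can be checked outside ddply. A minimal sketch using the muc 1994 values from the example (the variable name `doc_1994` is only for illustration):

```r
# doc values for one station/year group, ordered from top layer to bottom
doc_1994 <- c(80, 10, 3, 2)

# Drop the last element and prepend NA: each position now holds
# the value of the layer directly above it
doc_m1 <- c(NA, doc_1994[-length(doc_1994)])

print(data.frame(doc = doc_1994, doc_m1 = doc_m1))
```

Because the result always has the same length as the input, transform can attach it as a new column within each group.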
If they aren't already sorted by depth, make depth a factor with levels in the right order and then sort with regard to that. Then this approach should work.
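A rough sketch of that sorting step, assuming the depth classes run from top to bottom as HUM, 0-30, 31-60, 61-90, 91-220. The names `depth_levels`, `df_shuffled`, `df_sorted` and `doc3` are made up for illustration, and `df_shuffled` is a small hypothetical input with the same columns as df but with rows out of order:

```r
library(plyr)

# Depth classes from top to bottom (order assumed from the question)
depth_levels <- c("HUM", "0-30", "31-60", "61-90", "91-220")

# Hypothetical unsorted input with the same columns as df
df_shuffled <- data.frame(
  station = "muc",
  year    = 1994,
  depth   = c("61-90", "HUM", "91-220", "31-60"),
  doc     = c(3, 80, 2, 10)
)

# Make depth a factor so that sorting follows soil order, not alphabetical order
df_shuffled$depth <- factor(df_shuffled$depth, levels = depth_levels)

# Sort by station, year, and then depth (by factor level)
df_sorted <- df_shuffled[order(df_shuffled$station,
                               df_shuffled$year,
                               df_shuffled$depth), ]

# Same lag as before, now safe because rows are in depth order within each group
doc3 <- ddply(df_sorted, .(station, year), transform,
              doc_m1 = c(NA, doc[-length(doc)]))
print(doc3)
```

Note that a missing depth class is simply skipped: the lag returns the doc of the nearest layer above that is actually present in that station and year.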