I need to build several data frames inside a for loop, one for each i, where each data frame is the rbind of the subsets produced in an inner loop.
Here is an example:
library(lubridate)
DATA <- data.frame("ID"=c("01","01","02","02","03","03","03","04","04","04","05","05","05","06","06","06"),"x"=c("2009","2012","2013","2009","2012","2011","2013","2009","2010","2010","2011","2010","2009","2010","2011","2013"),"y"=c("a","a","a","b","c","a","a","c","b","b","c","a","c","a","b","c"))
one <- c("a","b","c")
two <- c("2009","2010","2011","2012")
listofdfs <- list()
for (i in one) {
  for (j in 1:(length(two)-1)) {
    B <- NULL
    A <- DATA[DATA$x %in% c(two[j],two[j+1]) & DATA$y==i ,]
    # keep the oldest date
    tmp <- as.Date(A$x,"%Y-%m-%d")
    if (length(tmp)>0) {
      A$tag <- 0
      names(tmp) <- 1:nrow(A)
      id_x <- as.numeric(tapply(tmp[year(tmp)>two[j]],A$ID[year(tmp)>two[j]],function(x){
        id <- which.min(x)
        names(x)[id]
      }))
      A$tag[id_x] <- 1
    }
    # aggregate datasets over different years
    B <- rbind(B,A)
  }
  listofdfs[[i]] <- B
}
It's almost what I want, except that B is not an aggregation of the different A subsets (it gets overwritten on each iteration).
I obtain a list like this:
$a
  ID    x y tag
2 01 2012 a   0
6 03 2011 a   0
$b
   ID    x y tag
15 06 2011 b   0
$c
   ID    x y tag
5  03 2012 c   0
11 05 2011 c   0
where 2010 is not present because of overwriting (which I don't want).
Any idea? Thanks!
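B is reset to NULL on every pass of the inner loop, so only the last A survives. The reset has to happen once per i, in the outer loop. Where you write
for (i in one) {
  for (j in 1:(length(two)-1)) {
    B <- NULL
it should be
for (i in one) {
  B <- NULL
  for (j in 1:(length(two)-1)) {
which gives this output: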
> listofdfs
$a
    ID    x y tag
1   01 2009 a   0
12  05 2010 a   0
14  06 2010 a   0
6   03 2011 a   0
121 05 2010 a   0
141 06 2010 a   0
2   01 2012 a   0
61  03 2011 a   0
$b
    ID    x y tag
4   02 2009 b   0
9   04 2010 b   0
10  04 2010 b   0
91  04 2010 b   0
101 04 2010 b   0
15  06 2011 b   0
151 06 2011 b   0
$c
    ID    x y tag
8   04 2009 c   0
13  05 2009 c   0
11  05 2011 c   0
5   03 2012 c   0
111 05 2011 c   0
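As a possible variation on the same fix, here is a minimal sketch that stores each A in a list and binds everything once per i with do.call(rbind, ...), which avoids growing B inside the loop. The name pieces is my own and not from the question. The tagging step is left out and tag is simply set to 0 as a placeholder; that is an assumption made to keep the sketch short, not a statement about what the tagging should do.

```r
# Same example data as in the question
DATA <- data.frame(
  ID = c("01","01","02","02","03","03","03","04","04","04","05","05","05","06","06","06"),
  x  = c("2009","2012","2013","2009","2012","2011","2013","2009","2010","2010","2011","2010","2009","2010","2011","2013"),
  y  = c("a","a","a","b","c","a","a","c","b","b","c","a","c","a","b","c"),
  stringsAsFactors = FALSE
)
one <- c("a", "b", "c")
two <- c("2009", "2010", "2011", "2012")

listofdfs <- list()
for (i in one) {
  # one slot per window of two consecutive years; created once per i
  pieces <- vector("list", length(two) - 1)
  for (j in seq_len(length(two) - 1)) {
    A <- DATA[DATA$x %in% c(two[j], two[j + 1]) & DATA$y == i, ]
    # placeholder (assumption): the question's tagging logic is omitted here
    A$tag <- rep(0, nrow(A))
    pieces[[j]] <- A
  }
  # bind all windows for this i in a single call
  listofdfs[[i]] <- do.call(rbind, pieces)
}

listofdfs
```

Note that consecutive windows share a year (2009-2010, then 2010-2011, and so on), so a row whose year falls in two windows is selected twice. That is why rows such as 121 and 141 in the output above repeat rows 12 and 14. If that is not wanted, the windows or a de-duplication step would need adjusting.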