I'm having trouble manipulating some data in R. I have a data frame containing information about customer transactions. I extract the minimum date for each id as follows:
hold <- (lapply(with(train_train, split(date,id)),min)) # minimum date
This gives me the following list:
head(hold)
#$`15994113`
#[1] "2012-03-02"
#
#$`16203579`
#[1] "2012-03-02"
#
#$`17472223`
#[1] "2012-03-22"
What I then want to do is take the date returned for each id and merge it back onto a data frame containing other relevant variables for each id. I attempted to do it as follows:
hold <- as.data.frame(unlist(hold))
hold <- as.data.frame(cbind(row.names(hold),hold[,1]))
names(hold) <- c('id', 'mindate')
transactions.temp <- merge(x = transactions.pro, y = hold, by = 'id')
However, the cbind destroys the date format, and I can't work out how to get a data structure with columns 'id' and 'mindate' that I can merge onto my main dataset, which looks like this:
> head(transactions.pro)
           id totaltransactions totalspend        meanspend
1:  100007447              1096    6644.88 6.06284671532847
2:  100017875               348     992.29 2.85140804597701
3:  100051423               646    2771.43 4.29013931888545
4: 1000714152              2370   10509.08 4.43421097046414
5: 1002116097              1233    4158.51 3.37267639902676
6: 1004404618               754    2978.15 3.94980106100796
Any advice you provide will be hugely appreciated. Thanks!
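Your cbind call is implicitly converting your dates to character because of row.names: the default cbind builds a matrix, and a matrix can hold only one type, so combining the character row names with the dates coerces everything to character. Use the data.frame method for cbind instead, which keeps each column's own type. Essentially, replace: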
as.data.frame(cbind(row.names(hold),hold[,1]))
with
cbind.data.frame(row.names(hold), hold[,1])
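As a minimal sketch of the whole round trip, assuming the date column is of class Date: the toy train_train and transactions.pro below are made-up stand-ins for the real tables, and mins is a hypothetical name for the list the question calls hold. Note that unlist() on a list of Date values also drops the Date class, so this version skips unlist() and combines the per-id minima with c(), which keeps them as dates.

```r
# Toy stand-ins for the question's tables (hypothetical values, ids kept as character)
train_train <- data.frame(
  id   = c("15994113", "15994113", "16203579", "17472223", "17472223"),
  date = as.Date(c("2012-03-02", "2012-03-09", "2012-03-02",
                   "2012-03-22", "2012-04-01")),
  stringsAsFactors = FALSE
)
transactions.pro <- data.frame(
  id                = c("15994113", "16203579", "17472223"),
  totaltransactions = c(10, 20, 30),
  stringsAsFactors  = FALSE
)

# Same step as in the question: a named list with one minimum date per id
mins <- lapply(with(train_train, split(date, id)), min)

# Build the id / mindate table column by column, so nothing passes through a matrix.
# do.call(c, ...) combines the Date values with c(), which preserves the Date class.
hold <- data.frame(
  id               = names(mins),
  mindate          = do.call(c, unname(mins)),
  stringsAsFactors = FALSE
)

# Merge the minimum dates onto the main table by id
transactions.temp <- merge(x = transactions.pro, y = hold, by = "id")
str(transactions.temp)
```

Here id ends up as character because it comes from names(); if id in your main table is numeric, convert one side (for example with as.numeric) before merging so the key types match.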