R does not recognize my data table as a panel. I have closing prices and total-return prices spanning a few decades, but some months in between are missing, so a simple return calculation with lagged values does not work, for two reasons: returns should not be computed over lagged values that are more than one month apart, and the lag has to be taken within each series (one time series per company) rather than across all companies at once. My solution is this:
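library(dplyr)      # group_by(), mutate(), lag()
library(lubridate)  # month()
# Note: `<-` inside mutate() names the new column after the whole expression,
# which is why the columns are renamed with names() after each step.
df1 <- df %>%
  group_by(seriesid) %>%
  mutate(totret <- ifelse(month(date) - month(lag(date)) > 1, NA,
                          totalreturn / lag(totalreturn) - 1))
names(df1) <- c("date", "company", "totalreturn", "close", "seriesid", "ticker", "totret")
df1 <- df1 %>%
  group_by(seriesid) %>%
  mutate(closeret <- ifelse(month(date) - month(lag(date)) > 1, NA,
                            close / lag(close) - 1))
names(df1) <- c("date", "company", "totalreturn", "close", "seriesid", "ticker", "totret", "closeret")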
It is not fancy, but I could not get anything tidier to work, because R does not recognize the new columns by name until I rename them. My data looks like:
date company totalreturn close seriesid
1 1888-01-31 x 2.500 2.500 0005
2 1888-02-04 x 2.750 2.750 0005
3 1888-04-20 x 3.350 3.350 0005
4 1895-01-30 y 7.500 4.350 0001
5 1895-02-26 y 7.800 4.650 0001
With the code above, I am now able to get my data into this form:
date company totalreturn close seriesid totret closeret
1 1888-01-31 x 2.500 2.500 0005 NA NA
2 1888-02-04 x 2.750 2.750 0005 0.1 0.1
3 1888-04-20 x 3.350 3.350 0005 NA NA
4 1895-01-30 y 7.500 4.350 0001 NA NA
5 1895-02-26 y 7.800 4.650 0001 0.04 0.06897
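A possibly tidier variant, sketched under a few assumptions: using `=` rather than `<-` inside mutate() should name the new columns directly, so the names() calls would not be needed; the rows are assumed to be already sorted by date within each seriesid; and month_index() is a hypothetical helper (not part of dplyr or lubridate) that counts months including the year, since month(date) - month(lag(date)) on its own would treat, for example, January 1888 and January 1889 as zero months apart. The sample rows are copied from the data shown above.

```r
library(dplyr)      # group_by(), mutate(), lag(), if_else()
library(lubridate)  # year(), month()

# Sample rows copied from the question's data
df <- data.frame(
  date        = as.Date(c("1888-01-31", "1888-02-04", "1888-04-20",
                          "1895-01-30", "1895-02-26")),
  company     = c("x", "x", "x", "y", "y"),
  totalreturn = c(2.5, 2.75, 3.35, 7.5, 7.8),
  close       = c(2.5, 2.75, 3.35, 4.35, 4.65),
  seriesid    = c("0005", "0005", "0005", "0001", "0001")
)

# Hypothetical helper: a running month count that includes the year,
# so consecutive calendar months differ by exactly 1 even across New Year.
month_index <- function(d) year(d) * 12 + month(d)

df1 <- df %>%
  group_by(seriesid) %>%                      # lag() stays within each series
  mutate(
    gap      = month_index(date) - lag(month_index(date)),
    # Same rule as the original: NA when more than one month has passed.
    # The first row of each series has gap = NA and therefore also yields NA.
    totret   = if_else(gap > 1, NA_real_, totalreturn / lag(totalreturn) - 1),
    closeret = if_else(gap > 1, NA_real_, close / lag(close) - 1)
  ) %>%
  ungroup() %>%
  select(-gap)                                # drop the temporary column

print(df1)
```

Both return columns are created in a single mutate() call here, and the grouping is what keeps the lag from crossing from one company into the next.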