I have a large dataframe crsp
that contains several columns of daily stock data. Relevant for this question are the following columns (small extract to give you an idea):
PERMNO date monthyear PRC RET
10001 1990-01-02 199001 10.1250 0.0000
10001 1990-01-03 199001 10.0000 -0.0123
...
10001 1990-02-01 199002 10.0000 0.0000
10001 1990-02-02 199002 10.0625 0.0062
...
10002 1990-01-02 199001 6.1250 0.0000
10002 1990-01-03 199001 6.2000 0.0122
...
10002 1990-02-01 199002 6.2500 0.0000
10002 1990-02-02 199002 6.5000 0.0400
...
"PERMNO"
is the stock ID, "date"
the actual date, "monthyear"
indicates the month, "PRC"
is the price and "RET"
the daily return.
I am trying to add a new column "MonthlyReturn"
which shows the monthly return for each stock. The value should therefore be calculated for each month of each stock (grouped by PERMNO).
As far as my knowledge goes, there could be two possibilities to solve this:
Either way, I am aiming for the following output:
PERMNO date monthyear PRC RET MonthlyReturn
10001 1990-01-02 199001 10.1250 0.0000 0.1000
10001 1990-01-03 199001 10.0000 -0.0123 0.1000
...
10001 1990-02-01 199002 10.0000 0.0000 0.0987
10001 1990-02-02 199002 10.0625 0.0062 0.0987
...
10002 1990-01-02 199001 6.1250 0.0000 -0.0034
10002 1990-01-03 199001 6.2000 0.0122 -0.0034
...
10002 1990-02-01 199002 6.2500 0.0000 0.2340
10002 1990-02-02 199002 6.5000 0.0400 0.2340
...
Through research I came upon the monthlyReturn-function from quantmod, could this be useful?
Any help would be greatly appreciated as I just started learning R. Also feel free to add anything that could help me improve this question's suitability for SO.
Using Tidyverse, you can calculate your monthly return this way :
library(tidyverse)
library(lubridate)
df <- left_join(df, df %>%
arrange(PERMNO, date) %>% # order the data by stock id and date
filter(!wday(as.Date(date)) %in% c(1,7)) %>% # filter week end
group_by(PERMNO, monthyear) %>%
mutate(MonthlyReturn = last(PRC) / first(PRC) - 1) %>% # calculate the monthly return per stock id and month
select(PERMNO, monthyear, MonthlyReturn)) # Keep only these 3 columns
Hope this helps.