I started replicating the value factor used by Fama and French in r to build a portfolio strategy for my final dissertation.
I have a dataset of monthly market caps from the S&P 500 over the years. I created a loop to determine whether the variable (mkt cap) of a determined observation at a certain date is higher or lower than a certain threshold computed cross-sectionally at the same time (across all the observations of the variable mkt cap at time t). To achieve this, I thought the appropriate technique to be a for loop. In this way for each date I calculate the threshold and check the criteria. Unfortunately I am not able to store the logical during the loop. When I print the results I can see what I would like to store but when I try to store I get only the results related to the last step of the loop.
for(d in dates$date){
month <- data_tbk %>% filter(date==d)
up <- quantile(month$mktcap, 0.8, na.rm=TRUE)
low <- quantile(month$mktcap, 0.2, na.rm=TRUE)
data_tbk %>% filter(date==d) %>%
mutate(ptf=ifelse(mktcap>=up,1,ifelse(mktcap<=low,0,NA))) %>%
print
}
Another way I tried to pursue is the following but I got even less:
data_tbk$ptf <- NA
for(d in dates$date){
month <- data_tbk %>% filter(date==d)
up <- quantile(month$mktcap, 0.8, na.rm=TRUE)
low <- quantile(month$mktcap, 0.2, na.rm=TRUE)
data_tbk %>% filter(date==d) %>% filter(mktcap>=up) %>% ptf=1
filter(data_tbk, date==d) %>% filter(mktcap<=low) %>% ptf=0
}
How should I change the codes to get a column containing the logical 1 or 0 according to the criteria?
You won't need a loop.
Assuming your dataframe is data_tbk
, this code will create new variable is_higher
. 1 if mktcap
more then Q80%, 0 if less than Q20%, and NA for the rest.
library(dplyr)
data_tbk <- data_tbk %>%
mutate(is_higher = case_when(
mktcap > quantile(mktcap,0.8) ~ 1,
mktcap <= quantile(mktcap,0.2) ~ 0,
TRUE ~ NA) )
If you expect to calculate quantile per date, then add group_by
clause.
data_tbk <- data_tbk %>%
group_by(date) %>%
mutate(is_higher = case_when(
mktcap > quantile(mktcap,0.8) ~ 1,
mktcap <= quantile(mktcap,0.2) ~ 0,
TRUE ~ NA) )
PS. You need to install package dplyr
install.package("dplyr")