I want to create several columns with a ifelse()-condition for multiple dataframes. In this case the dataframes are 3 time-series data for cryptocurrencies. Here is the code to download the 3 dataframes automatically:
library(tidyverse)
library(crypto)
crypto_chart <- crypto_prices()%>% select(-id, -symbol,-price_btc, -`24h_volume_usd`,-available_supply, -total_supply,-max_supply, -percent_change_1h, -percent_change_24h, -percent_change_7d, -last_updated)%>% slice(1:3)
list_cryptocurrencies <-crypto_chart$name
map(list_cryptocurrencies,
function(x) crypto_history(x, start_date = '20150101', end_date = '20190303')%>%
select(-slug, -symbol, -name, -`ranknow`))%>%
set_names(list_cryptocurrencies)%>%
list2env(envir = .GlobalEnv)
##Calculating return
map(mget(list_cryptocurrencies),
function(x) x %>% mutate(`return` = (close-open)/open * 100))%>%
list2env(mget(list_cryptocurrencies), envir = .GlobalEnv)
Now I want to detect positive overreactions (oR_pos) in the returns. I define an overreaction as a value (return) higher than the mean + 1 standard deviation. I want do this also for 1.5 and 2 standard deviations. Here ist my desired output for one cryptocurrencie (Bitcoin):
> Bitcoin
date open close return oR_pos>1sd oR_pos>1.5sd oR_pos>2sd
1 2018-01-01 14112.2 13657.2 -3.2241607 NA NA NA
2 2018-01-02 13625.0 14982.1 9.9603670 9.960367 9.960367 9.960367
3 2018-01-03 14978.2 15201.0 1.4874952 NA NA NA
4 2018-01-04 15270.7 15599.2 2.1511784 NA NA NA
5 2018-01-05 15477.2 17429.5 12.6140387 12.614039 12.614039 12.614039
6 2018-01-06 17462.1 17527.0 0.3716621 NA NA NA
7 2018-01-07 17527.3 16477.6 -5.9889430 NA NA NA
8 2018-01-08 16476.2 15170.1 -7.9271919 NA NA NA
9 2018-01-09 15123.7 14595.4 -3.4931928 NA NA NA
10 2018-01-10 14588.5 14973.3 2.6376941 NA NA NA
11 2018-01-11 14968.2 13405.8 -10.4381288 NA NA NA
12 2018-01-12 13453.9 13980.6 3.9148500 3.914850 NA NA
Now I have 3 new columns with overreactions(oR_pos) which are > 1sd; 1.5sd and 2sd.
I've already tried this code:
oR_pos_function <- function(y) {
n <- seq(1, 2, 0.5)
y[paste0("oR_pos>", n, "sd")] <-lapply(n, function(x)
ifelse(x$return > mean(x$return)+ sd(x$return),x$return, NA))
y
}
map(mget(list_cryptocurrencies), oR_pos_function)%>%
set_names(list_cryptocurrencies)%>%
list2env(envir = .GlobalEnv)
But it doesen't works. Can someone help me?
The following closely matches your intended function, adding the desired columns onto your crypto, while allowing the desired sd thresholds to be passed-in as parameter for flexibility. Aside note, the solution below uses >
as per OP, but you may wish to consider movement +/- direction from sd. Using solution below could be done using instead:
col <- ifelse(returns > (r_mean+(r_sd*threshold)) |
returns < (r_mean-(r_sd*threshold)),
returns,NA)
Solution as follows:
oR_pos_function <- function(returns,thresholds) {
r_mean <- mean(returns,na.rm=T)
r_sd <- sd(returns,na.rm=T)
cols <- lapply(thresholds,function(threshold) {
col <- ifelse(returns > (r_mean+(r_sd*threshold)),returns,NA)
return(col)
})
cols <- as.data.frame(cols)
names(cols) <- paste0("oR_pos>",thresholds,"sd")
return(cols)
}
new_cols <- oR_pos_function(returns=Bitcoin$return,thresholds=c(1,1.5,2))
Bitcoin <- cbind(Bitcoin,new_cols)
Results:
> head(Bitcoin[Bitcoin$date>="2018-01-01",])
date open high low close volume market close_ratio spread return oR_pos>1sd oR_pos>1.5sd oR_pos>2sd
1097 2018-01-01 14112.2 14112.2 13154.7 13657.2 10291200000 229119155396 0.5248042 957.5 -3.2241607 NA NA NA
1098 2018-01-02 13625.0 15444.6 13163.6 14982.1 16846600192 251377913955 0.7972381 2281.0 9.9603670 9.960367 9.960367 9.960367
1099 2018-01-03 14978.2 15572.8 14844.5 15201.0 16871900160 255080562912 0.4894961 728.3 1.4874952 NA NA NA
1100 2018-01-04 15270.7 15739.7 14522.2 15599.2 21783199744 261795321110 0.8845996 1217.5 2.1511784 NA NA NA
1101 2018-01-05 15477.2 17705.2 15202.8 17429.5 23840899072 292544135538 0.8898258 2502.4 12.6140387 12.614039 12.614039 12.614039
1102 2018-01-06 17462.1 17712.4 16764.6 17527.0 18314600448 294217423675 0.8043891 947.8 0.3716621 NA NA NA
>
Alternative per comments:
oR_pos_function <- function(coin_data,thresholds) {
returns <- coin_data$return
r_mean <- mean(returns,na.rm=T)
r_sd <- sd(returns,na.rm=T)
cols <- lapply(thresholds,function(threshold) {
col <- ifelse(returns > (r_mean+(r_sd*threshold)),returns,NA)
return(col)
})
cols <- as.data.frame(cols)
names(cols) <- paste0("oR_pos>",thresholds,"sd")
coin_data <- cbind(coin_data,cols)
return(coin_data)
}