
R: if else statement is handling column as whole vector


I have a dataset for which I want to calculate the 6-month return of stocks with tq_get (see example below).

Dataset called top

ticker 6month
AKO.A
BIG
BGFV

Function

library(tidyverse)
library(dplyr)
library(tidyquant)
library(riingo)

calculate <- function (x) {
  (tq_get(x, get = "tiingo", from = yesterday, to = yesterday)$adjusted/tq_get(x, get = "tiingo", from = before, to = before)$adjusted)-1
}

top[2] <- lapply(top[1], function(x) calculate(x))

Unfortunately, for some of the tickers no value exists. This results in an error when simply using lapply or mutate, because the resulting vector is shorter (fewer rows) than the existing dataset. Handling this with try_catch did not work.

As a workaround, I now want to check with is_supported_ticker(), provided by the riingo package, whether each ticker is available:

calculate <- function (x) {
  if (is_supported_ticker(x, type = "tiingo") == TRUE) {
  (tq_get(x, get = "tiingo", from = yesterday, to = yesterday)$adjusted/tq_get(x, get = "tiingo", from = before, to = before)$adjusted)-1
  }
  else {
    NA
  }
}

top[2] <- lapply(top[1], function(x) calculate(x))

But now I receive the error message: x ticker must be length 1, but is actually length 3.

I assume this is based on the fact that the whole first column of my dataset is used as input for is_supported_ticker() instead of row by row. How can I resolve this issue?


Solution

  • Glancing at the documentation, it looks like tq_get supports multiple symbols, while is_supported_ticker only takes one at a time. So you should probably check each ticker to see whether it is supported, and then call tq_get once on all the supported ones. Something like this (untested, as I don't have any of these packages):

    calculate <- function (x) {
      supported = sapply(x, is_supported_ticker, type = "tiingo")
      result = rep(NA, length(x))
      result[supported] = 
        (
          tq_get(x[supported], get = "tiingo", from = yesterday, to = yesterday)$adjusted / 
          tq_get(x[supported], get = "tiingo", from = before, to = before)$adjusted
        ) - 1
      return(result)
    }
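
As for why the original call saw all three tickers at once: lapply() on a data frame iterates over its columns, and top[1] is a one-column data frame, so the function is called once with the whole column. A minimal sketch with a toy data frame (no tidyquant or riingo needed; length() stands in for the real work):

```r
# Toy stand-in for the question's data; no network access needed.
top <- data.frame(ticker = c("AKO.A", "BIG", "BGFV"), stringsAsFactors = FALSE)

# lapply() over a data frame visits each column, so the function runs once
# and receives the entire ticker column.
per_column <- lapply(top[1], function(x) length(x))
per_column$ticker

# Iterating over the column vector itself gives one call per ticker.
per_ticker <- vapply(top$ticker, function(x) length(x), integer(1))
unname(per_ticker)
```

The first result is 3 (one call, three tickers); the second is a vector of three 1s (three calls, one ticker each).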
    

    It worries me that before and yesterday aren't function arguments - they're just assumed to be there in the global environment. I'd suggest passing them in as arguments to calculate(), like this:

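    calculate <- function (x, before, yesterday) {
      # is_supported_ticker() takes one ticker at a time, so check each element
      supported = vapply(x, is_supported_ticker, logical(1), type = "tiingo")
      # start with NA everywhere; unsupported tickers keep it
      result = rep(NA_real_, length(x))
      # one tq_get call per date, covering all supported tickers
      result[supported] =
        (
          tq_get(x[supported], get = "tiingo", from = yesterday, to = yesterday)$adjusted / 
          tq_get(x[supported], get = "tiingo", from = before, to = before)$adjusted
        ) - 1
      return(result)
    }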
    
    # then calling it
    calculate(top$ticker, before = <...>, yesterday = <...>)
    

    This way you can pass values in for before and yesterday on the fly. If they are objects in your global environment, you can simply use calculate(top$ticker, before, yesterday), but it gives you freedom to vary those arguments without redefining those names in your global environment.
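
One caveat, since the code above is untested: a supported ticker may still have no price on one of the two dates, in which case the two $adjusted vectors would no longer line up with x[supported]. A possible safeguard is to align prices by symbol instead of by position. The sketch below assumes the multi-symbol result has a symbol column and simply omits rows for missing data; fake_prices() and all the prices are made up for illustration and stand in for the tq_get calls.

```r
# Hypothetical stand-in for a multi-symbol price lookup: returns one row per
# symbol that has data and silently drops the rest (an assumption about how
# missing data shows up).
fake_prices <- function(symbols, prices) {
  found <- symbols[symbols %in% names(prices)]
  data.frame(symbol = found, adjusted = unname(prices[found]),
             stringsAsFactors = FALSE)
}

# Made-up prices; "BIG" has no earlier price on purpose.
now_prices  <- c(AKO.A = 11, BIG = 5, BGFV = 12)
then_prices <- c(AKO.A = 10, BGFV = 16)

tickers <- c("AKO.A", "BIG", "BGFV")
now  <- fake_prices(tickers, now_prices)
then <- fake_prices(tickers, then_prices)

# match() looks each ticker up by name and yields NA where it is absent,
# so the result always has one entry per input ticker.
ret <- now$adjusted[match(tickers, now$symbol)] /
  then$adjusted[match(tickers, then$symbol)] - 1
names(ret) <- tickers
ret
```

Here ret has one entry per ticker, with NA for "BIG", so it can be assigned straight into a column of the original data frame.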