Tags: r, data.table, finance, quantitative-finance

lapply and if in data.table


I want to check the following within my data.table:

We delete the returns for which R_t or R_{t-1} is greater than 300% and (1 + R_t)(1 + R_{t-1}) - 1 is less than 50%.

I have a data.table with many return columns, where every column represents one company and the rows are date-specific. The returns are in the data.table ReturnsDS01, and they contain missing values (NA) too.

I tried to apply this via the following code:

ReturnsNEW <- ReturnsDS01[,lapply(.SD, function(x) ifelse((x > 3 || shift(x, fill = NA) > 3) && ((1+x)(1+shift(x, fill = NA))-1)) < 0.5, x = NA, x=x), .SDcols = names(ReturnsDS01)[sapply(ReturnsDS01, is.numeric)]]

My aim was to do it via an ifelse function: if x is greater than 3 or shift(x) is greater than 3, AND (1+x)*(1+shift(x))-1 is smaller than 0.5, set x = NA.

First issue: the code isn't working; I get the following error:

Error in FUN(X[[i]], ...) : 
  formal argument "x" matched by multiple actual arguments

Second issue: I want to set both x and shift(x) to NA if they fulfill these conditions, but I don't know how, as c(x, shift(x)) = NA isn't working either.

Could someone help me out a bit?

Thanks in advance.


Solution

  • First issue

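    There are several syntax issues, for example:

    • use "|" and "&" rather than "||" and "&&" when comparing vectors
    • there is no need for "x=NA" or "x=x"; just end the ifelse call with "NA, x)"
    • when multiplying two bracketed expressions, you need a "*" between them
    • the brackets do not match: the ifelse( call is closed before "< 0.5", so "x = NA, x=x" are passed on by lapply to the anonymous function, which is what triggers the error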
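
To see where the error message comes from, here is a minimal sketch with toy values (not the original data). Because the ifelse( call in the question is closed too early, the trailing `x = NA, x = x` become extra arguments to lapply, which forwards them to the function. The second half illustrates why `|` and `&` are the right operators for a vectorised test.

```r
# lapply() forwards extra named arguments to the function, so the formal `x`
# is matched by more than one named argument and R stops with an error.
msg <- tryCatch(
  lapply(list(1), function(x) x, x = NA, x = 1),
  error = function(e) conditionMessage(e)
)
print(msg)

# `|` and `&` compare element by element, which is what ifelse() needs.
# `||` and `&&` expect length-one values (recent R versions raise an error
# when given longer vectors).
x <- c(0.1, 4, -0.9)
flag <- (x > 3) | (x < 0)
print(flag)
print(ifelse(flag, NA, x))
```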

    Hence, code that works for me looks like this:

    library(data.table)
    ReturnsDS01 = data.table(a = runif(10,0,1),b = runif(10,0,1))
    numericVar = names(ReturnsDS01)[sapply(ReturnsDS01, is.numeric)]
    # shift(x) is the previous row's return; "%in% TRUE" treats an NA condition as FALSE
    lagNew  <- function(x) ifelse(((x > 3 | shift(x) > 3) & (1+x)*(1+shift(x))-1 < 0.5) %in% TRUE, NA, x)
    ReturnsNEW <- ReturnsDS01[,lapply(.SD, lagNew), .SDcols = numericVar]
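
One detail worth checking with this kind of condition is how missing values behave. shift(x) is NA in the first row, and the question mentions NAs in the data, so the test itself can be NA. The sketch below uses made-up values to show the difference between passing the raw condition to ifelse and guarding it with `%in% TRUE`; treat it as an illustration, not as part of the original answer.

```r
library(data.table)

x <- c(0.2, 4, -0.9, NA, 0.1)
prev <- shift(x)          # previous row's value; the first element is NA
print(prev)

cond <- (x > 3 | prev > 3) & (1 + x) * (1 + prev) - 1 < 0.5
print(cond)               # NA wherever the comparison cannot be decided

# ifelse() returns NA for an NA test, so undecided rows would be blanked too;
# `%in% TRUE` maps NA to FALSE and leaves those rows unchanged.
naive <- ifelse(cond, NA, x)
guarded <- ifelse(cond %in% TRUE, NA, x)
print(naive)
print(guarded)
```

With the raw condition, the first row and the row after a missing value are lost even though neither return exceeds 300%.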
    

  • Second issue

    You may need to revise the function a little, like this:

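    ReturnsDS01 = data.table(a = runif(10,0,1),b = runif(10,0,1))
    ReturnsDS01$a[3] = 4      # 400% return, above the 300% threshold
    ReturnsDS01$a[2] = -0.9   # previous row: (1+4)*(1-0.9)-1 = -0.5 < 0.5
    numericVar = names(ReturnsDS01)[sapply(ReturnsDS01, is.numeric)]
    lagNew  <- function(x) {
      # rows where the condition is TRUE; which() drops NA conditions
      ind = which((x > 3 | shift(x) > 3) & (1+x)*(1+shift(x))-1 < 0.5)
      x[ind] = NA                # blank the flagged row ...
      x[setdiff(ind-1,0)] = NA   # ... and the row before it
      x
    }
    
    ReturnsNEW <- ReturnsDS01[,lapply(.SD, lagNew), .SDcols = numericVar]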
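
Since the rows are date-specific, it may be useful to keep the non-numeric columns as well. The sketch below is one possible variant, not part of the original answer: it applies the same lagNew logic with `:=` on a copy of the table, so the date column survives. The table `returns` and its values are hypothetical toy data.

```r
library(data.table)

# Hypothetical toy data: one date column plus two return columns.
returns <- data.table(
  date = as.Date("2020-01-01") + 0:4,
  a = c(0.1, -0.9, 4, 0.2, NA),
  b = c(0.05, 3.5, 0.3, NA, 0.1)
)

# Same logic as lagNew above: blank the flagged row and the row before it.
lagNew <- function(x) {
  ind <- which((x > 3 | shift(x) > 3) & (1 + x) * (1 + shift(x)) - 1 < 0.5)
  x[ind] <- NA
  x[setdiff(ind - 1, 0)] <- NA
  x
}

numericVar <- names(returns)[sapply(returns, is.numeric)]

# Work on a copy so the original table stays untouched; := updates by reference.
cleaned <- copy(returns)
cleaned[, (numericVar) := lapply(.SD, lagNew), .SDcols = numericVar]
print(cleaned)
```

In column a, rows 2 and 3 are set to NA because the 400% return is immediately preceded by -90%. Column b keeps its 350% return, since the compounded two-period return there is well above 50%.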