Understanding R syntax in this example


Finding Peak in a dataset - using R

Hi

I saw this thread on Stack Exchange. I am not an R programmer (yet), but I would like to implement it in C. Not being familiar with R syntax, I can't follow the code. I know it's creating arrays such as y.max and i.max, but I am not sure which operations are performed or how the arrays are manipulated. Here are the four lines I am particularly interested in.

  y.max <- rollapply(zoo(y.smooth), 2*w+1, max, align="center")
  delta <- y.max - y.smooth[-c(1:w, n+1-1:w)]
  i.max <- which(delta <= 0) + w
  list(x=x[i.max], i=i.max, y.hat=y.smooth)

Some pointers for understanding this syntax would be helpful.


Solution

  • Here is a translation of that code. R often uses nested function calls that can be hard to understand if you don't know what each function does. To help with this, I separated some lines into multiple lines and stored the results in new variables.

    # convert y.smooth to a zoo (time series) object
    zoo_y.smooth <- zoo(y.smooth)
    
    # divide the data into rolling windows of width 2*w+1
    # get the max of each window
    # align = "center" makes the indices of y.max be aligned to the center
    # of the windows
    y.max <- rollapply(zoo_y.smooth, 
                       width = 2*w+1, 
                       FUN = max, 
                       align="center")
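
    If the zoo package is unfamiliar, the rolling maximum can be pictured with plain base R. The following is only a sketch of what rollapply computes here, not how zoo implements it; the data and the value of w are made up for illustration.

    ```r
    # Hypothetical toy data; y.smooth and w stand in for the original variables
    y.smooth <- c(1, 3, 2, 5, 4, 4, 6, 2)
    w <- 1
    n <- length(y.smooth)

    # Centers for which a full window of width 2*w+1 fits: w+1, ..., n-w
    centers <- (w + 1):(n - w)

    # Maximum of each window y.smooth[i-w], ..., y.smooth[i+w]
    y.max <- sapply(centers, function(i) max(y.smooth[(i - w):(i + w)]))

    print(y.max)          # 3 5 5 5 6 6
    print(length(y.max))  # 6, i.e. n - 2*w: no value for the first and last w points
    ```

    In C this would be two nested loops: an outer loop over the centers and an inner loop over the 2*w+1 values of each window.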
    

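    R subsetting can be very terse. c(1:w, n+1-1:w) creates a vector of indices, which I store in toExclude below. Because : binds more tightly than + and -, n+1-1:w means (n+1) - (1:w), so the vector holds the first w indices and, assuming n is the length of y.smooth, the last w. Passing that vector with a - to the subsetting operator [] selects all elements of y.smooth except those at the indices in toExclude. Omitting the - would do the opposite and select only those elements.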

    # select all elements of y.smooth except the first w and the last w
    # (n+1-1:w is (n+1)-(1:w); this assumes n is length(y.smooth))
    toExclude <- c(1:w, n+1-1:w)
    subset_y.smooth <- y.smooth[-toExclude]
    
    # element-wise subtraction
    delta <- y.max - subset_y.smooth
    
    # logical vector the same length as delta indicating which elements
    # are less than or equal to 0
    nonPositiveDelta <- delta <= 0
    

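    So nonPositiveDelta is a vector like TRUE FALSE FALSE... with one element for each element of delta, indicating which elements of delta are non-positive. Each window contains its own center point, so y.max can never be smaller than the matching y.smooth value. In practice delta <= 0 therefore means delta == 0: the point equals the maximum of its window, which is what makes it a peak.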
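
    To make the index arithmetic and the comparison concrete, here is a small sketch with made-up numbers. The data are hypothetical, and n is assumed to be the length of y.smooth, which the original snippet does not show.

    ```r
    w <- 2
    y.smooth <- c(1, 2, 5, 3, 2, 4, 7, 6, 3, 1)   # hypothetical data
    n <- length(y.smooth)                         # assumption: n is the length

    # ':' is evaluated before '+' and '-', so n+1-1:w means (n+1) - (1:w)
    toExclude <- c(1:w, n + 1 - 1:w)
    print(toExclude)               # 1 2 10 9

    # A negative index drops elements instead of selecting them
    print(y.smooth[-toExclude])    # 5 3 2 4 7 6

    # Comparing a vector with a scalar gives a logical vector of the same length
    print(c(0, 2, 0, 1) <= 0)      # TRUE FALSE TRUE FALSE
    ```

    The trimmed vector has n - 2*w elements, the same length as y.max, which is why the element-wise subtraction lines up.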

    # vector containing the index of each element of delta that's <= 0
    indicesOfNonPositiveDeltas <- which(nonPositiveDelta)
    

    indicesOfNonPositiveDeltas, on the other hand, is a vector like 1, 3, 4, 5, 8 containing the index of every element of the previous vector that was TRUE.
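
    The step from the logical vector to the index vector can be tried in isolation. The delta values below are invented, and the explicit loop is only there to show the kind of code a C port would need in place of which().

    ```r
    # Hypothetical delta values; 0 marks a point equal to its window maximum
    delta <- c(2, 0, 1, 0, 0, 3)

    nonPositiveDelta <- delta <= 0
    indicesOfNonPositiveDeltas <- which(nonPositiveDelta)
    print(indicesOfNonPositiveDeltas)   # 2 4 5

    # The same result with an explicit loop (indices in R start at 1, not 0)
    idx <- integer(0)
    for (k in seq_along(delta)) {
      if (delta[k] <= 0) idx <- c(idx, k)
    }
    print(idx)                          # 2 4 5
    ```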

    # add w to convert positions in delta back to positions in y.smooth
    # (delta[1] corresponds to y.smooth[w+1])
    i.max <- indicesOfNonPositiveDeltas + w
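
    The reason for adding w is that delta is shorter than y.smooth: its first element belongs to position w+1. The sketch below runs all the steps on made-up data, using a base R loop in place of rollapply, so it shows the idea without claiming to reproduce the zoo call exactly.

    ```r
    y.smooth <- c(1, 3, 2, 5, 4, 4, 6, 2)   # hypothetical data
    w <- 1
    n <- length(y.smooth)                   # assumption: n is the length

    # Drop the first and last w points; trimmed[k] is y.smooth[k + w]
    trimmed <- y.smooth[-c(1:w, n + 1 - 1:w)]

    # Rolling maximum over windows of width 2*w+1 (stand-in for rollapply)
    y.max <- sapply((w + 1):(n - w), function(i) max(y.smooth[(i - w):(i + w)]))

    delta <- y.max - trimmed        # 0 3 0 1 2 0
    i.max <- which(delta <= 0) + w  # positions in y.smooth, not in delta

    print(i.max)                    # 2 4 7
    print(y.smooth[i.max])          # 3 5 6
    ```

    Without the + w the indices would point one window half-width too far to the left.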
    

    Finally, the results are stored in a list. A list is sort of like an array of arrays, where each element of the list can itself be another list or any other type. In this case, each element of the list is a vector.

    # create a three element list 
    # each element is named, with the name to the left of the equal sign
    list(
      x=x[i.max], # the elements of x at indices specified by i.max
      i=i.max, # the indices themselves
      y.hat=y.smooth) # the y.smooth data
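
    As a rough illustration of how such a list is used afterwards, here is a sketch with invented values. The name result is hypothetical; the original snippet does not show what the list is assigned to.

    ```r
    x <- c(10, 20, 30, 40, 50)        # hypothetical x values
    y.smooth <- c(1, 4, 2, 6, 3)      # hypothetical smoothed y values
    i.max <- c(2, 4)                  # hypothetical peak indices

    result <- list(x = x[i.max], i = i.max, y.hat = y.smooth)

    print(names(result))    # "x" "i" "y.hat"
    print(result$x)         # 20 40
    print(result[["i"]])    # 2 4
    print(length(result))   # 3
    ```

    The closest C analogue is probably a struct with three array members plus their lengths, since the elements have different lengths.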
    

    Without seeing the rest of the code or a description of what it's supposed to be doing, I had to guess a bit, but hopefully this helps you out.