Tags: r, dataframe, if-statement, dplyr, mutate

Return the prior self-calculation in mutate as an option when using a conditional


I have a one-column data frame and want to determine whether the values are increasing (1) or decreasing (-1); when there is no change, the result should repeat the last direction calculated. I think the code below should do it, but dplyr returns an "object not found" error, presumably because the column refers to itself before it exists. Any thoughts on how this can be done?

df <- data.frame(Val = c(1:5,5,5,5:1,1,1,1,6,1,1,5:1))

df %>%
  mutate(ValDirection = ifelse(Val > lag(Val, 1), 1,
                               ifelse(Val < lag(Val, 1), -1, lag(ValDirection, 1))))

The desired result is:

df <- data.frame(Val = c(1:5,5,5,5:1, 1,1,1,6,1,1,5:1),
                 ValDirection = c(1,1,1,1,1,1,1,1,-1,-1,-1,-1,-1,-1,-1,1,-1,-1,1,-1,-1,-1,-1))

Solution

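  • The error occurs because ValDirection is referenced before it has been defined: mutate() evaluates the right-hand side before the new column exists, so lag(ValDirection, 1) has nothing to look up. You can replace lag(ValDirection, 1) with NA and then use tidyr::fill() to fill the missing values with the previous value.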

    library(dplyr)
    
    df %>%
      mutate(ValDirection = ifelse(Val > lag(Val, 1), 1, ifelse(Val < lag(Val, 1), -1, NA))) %>%
      tidyr::fill(ValDirection)
    

    You can also use case_when() from dplyr to replace the nested ifelse():

    df %>%
      mutate(ValDirection = case_when(Val > lag(Val, 1) ~ 1, Val < lag(Val, 1) ~ -1)) %>%
      tidyr::fill(ValDirection)
    

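    Another option is to take the sign of the successive differences and turn the zeros (no change) into NA with na_if() before filling: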

    df %>%
      mutate(ValDirection = na_if(sign(c(1, diff(Val))), 0)) %>%
      tidyr::fill(ValDirection)
    
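    Output (first 12 rows, from the last version; the two lag()-based versions return NA in row 1, because lag(Val) is NA there and fill() only fills downward)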
    #    Val ValDirection
    # 1    1            1
    # 2    2            1
    # 3    3            1
    # 4    4            1
    # 5    5            1
    # 6    5            1
    # 7    5            1
    # 8    5            1
    # 9    4           -1
    # 10   3           -1
    # 11   2           -1
    # 12   1           -1
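
For comparison, the carry-forward logic the question was reaching for (reuse the previous result when there is no change) can be sketched in base R with Reduce(accumulate = TRUE). This is a minimal sketch rather than part of the answer above; seeding the first row with 1 is an assumption taken from the desired result in the question.

```r
# Values from the question
Val <- c(1:5, 5, 5, 5:1, 1, 1, 1, 6, 1, 1, 5:1)

# Sign of each step: 1 = increase, -1 = decrease, 0 = no change.
# The first element has no predecessor, so it is seeded with 1 (assumption).
step <- sign(c(1, diff(Val)))

# Walk through the steps, keeping the previous direction whenever the step is 0.
ValDirection <- Reduce(
  function(prev, cur) if (cur == 0) prev else cur,
  step,
  accumulate = TRUE
)

# Desired result copied from the question
expected <- c(1, 1, 1, 1, 1, 1, 1, 1, -1, -1, -1, -1, -1, -1, -1,
              1, -1, -1, 1, -1, -1, -1, -1)

all(ValDirection == expected)
```

Because the function passed to Reduce() receives the previously computed direction as its first argument, this is the self-referencing calculation that mutate() cannot express directly with lag().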