Search code examples
rfunctiondplyrpipetransformation

Exclude current observation from computation in dplyr pipe


I want to exclude the current observation from the function I am applying to my data in a dplyr pipe, because I want to know the value without this observation.

To keep it simple, let's compute the mean including the current observation and excluding this observation. The solution should be applicable to other computations or function (in my case, the Gini function from DescTools package).

Let's say we look at ice prices of three different ices (ice_id = ice identifier) at three different days (day).

da <- data.frame(ice_id = c(1,1,1,2,2,2,3,3,3), day = c(1,2,3,1,2,3,1,2,3), price = c(1.60,1.90,1.80,2.10,2.05,2.30,0.50,0.40,0.35))
da
  ice_id day price
1      1   1  1.60
2      1   2  1.90
3      1   3  1.80
4      2   1  2.10
5      2   2  2.05
6      2   3  2.30
7      3   1  0.50
8      3   2  0.40
9      3   3  0.35

I want to add one column indicating the mean price of the ice including this day and one column indicating the mean price of the ice excluding this day.

da = da %>%
  group_by(ice_id) %>%
  mutate(mean_price = mean(price),
         mean_price_without = ?)

How can I add the mean price without the current observation itself?


Solution

  • For a general case to remove current observation and perform calculation, you could use map_dbl

    library(dplyr)
    library(purrr)
    da %>%
      group_by(ice_id) %>%
      mutate(mean_price = mean(price),
             mean_price_without = map_dbl(day, ~mean(price[-.x])))
             #Or
             #mean_price_without = map_dbl(day, ~mean(price[day != .x])))
             #mean_price_without = map_dbl(row_number(), ~mean(price[-.x])))
    
    
    #  ice_id   day price mean_price mean_price_without
    #   <dbl> <dbl> <dbl>      <dbl>              <dbl>
    #1      1     1  1.6       1.77               1.85 
    #2      1     2  1.9       1.77               1.7  
    #3      1     3  1.8       1.77               1.75 
    #4      2     1  2.1       2.15               2.17 
    #5      2     2  2.05      2.15               2.2  
    #6      2     3  2.3       2.15               2.08 
    #7      3     1  0.5       0.417              0.375
    #8      3     2  0.4       0.417              0.425
    #9      3     3  0.35      0.417              0.45