Search code examples
rdplyrdata.tablereshape2

R : Finding minimum value on the basis of changing preceding and suceeding values in group by column


Adding for replicability:

  data.frame(
product=c(rep("x",2),rep("y",3)),
price_category_from=c(10,20,10,20,30),
price=c(30,31,31,30,27)

)

As shown below, I have a table which I would like to group by product and changing values of price_category_from column to find the minimum price.

product     price_category_from     price
  x                10                30
  x                20                31
  y                10                31
  y                20                30
  y                30                27

As shown below, the result table should contain minimum price.new column for changing the values in price_category_from column. For example, the price.new in both the rows for product x is 30 since the succeeding price value for price_category_from category is bigger. Whereas for product y the minimum value changes for every succeeding price_category_from categories, since next price values are smaller.

The values in price_category_from are intervals in increasing order.

product     price_category_from     price    price.new
  x                10                30        30
  x                20                31        30  **
  y                10                31        31
  y                20                30        30
  y                30                27        27

I hope that I was able to explain the problem. I would really appreciate your help (preferably with data.table). Thank you very much in advance.


Solution

  • You can use cummin to get the cumultive minimum (minimum of all values up to the given value)

    library(data.table)
    setDT(df)
    
    df[, price.new := cummin(price), by = product]
    
    df
    #    product price_category_from price price.new
    # 1:       x                  10    30        30
    # 2:       x                  20    31        30
    # 3:       y                  10    31        31
    # 4:       y                  20    30        30
    # 5:       y                  30    27        27
    

    Or with base R

    df$price.new <- with(df, ave(price, product, FUN = cummin))