Search code examples
rgroup-bydata.tablesummarytrend

Determine Sales trend with multiple variables like Customer ID/Item etc


I got stuck while working on trends. My problem is similar to this below thread but I have one extra variable called 'item'.

How to determine trend of time-series of values in R

My end result would be like below sample. Please help

Customer_ID Item    Sales_Slope  
Josh        milk      Positive
Josh         eggs      Negative
Eric         milk      Mixed
Eric         eggs      postive

My Data:

require("data.table")
dat <- data.table(
            customer_ID=c(rep("Josh",6),rep("Ray",7),rep("Eric",7)),
            item=c(rep("milk",3),rep("eggs",3),rep("milk",4),rep("eggs",3),rep("milk",3),rep("eggs",4)),
            sales=c(35,50,65,65,52,49,15,10,13,9,35,50,65,65,52,49,15,10,13,9))

dat[,transaction_num:=seq(1,.N), by=c("customer_ID")]

Solution

  • And the data.table approach which I outlined was:

    require(data.table)
    
    trend <- function(x) {
       ifelse(all(diff(x)>0), 'Positive',
       ifelse(all(diff(x)<0), 'Negative', 'Mixed'))
    }
    
    dat[, trend(sales), by=c("customer_ID","item")]
       customer_ID item       V1
    1:        Josh milk Positive
    2:        Josh eggs Negative
    3:         Ray milk    Mixed
    4:         Ray eggs Positive
    5:        Eric milk Negative
    6:        Eric eggs    Mixed
    
    # or if you want to assign the result...
    dat[, Sales_Slope:=trend(sales), by=c("customer_ID","item")]