Search code examples
rggplot2economics

Is there a way to plot a Total Revenue Curve in R using a price elasticity measure?


I have been at it all day. I'm pretty below average when it comes to R, so I'm not quite at the level of writing my own scripts/functions yet (but trying.. and failing)

I have data of annual sales, revenue, price of a good, median incomes and population.

Using glm, I eliminated some of the variables (autocorrelation, insignificance etc.).

Here is a dput of the dataframe I'm using:

dput(df)
structure(list(Year = 2008:2018, Sales = c(50681L, 53016L, 53981L, 
56204L, 55564L, 56916L, 61328L, 59686L, 59412L, 57298L, 57569L
), Population = c(9250000L, 9380000L, 9570000L, 9660000L, 9760000L, 
9850000L, 9940000L, 10040000L, 10160000L, 10270000L, 10454860L
), Income = c(52941L, 53127L, 50020L, 48816L, 47969L, 48294L, 
48385L, 48253L, 49489L, 51672L, 51752L), Price_up = c(15, 15.57, 
15.50772, 15.75584352, 16.26003051, 16.60149115, 20, 20.32, 20.34032, 
20.60474416, 21.03744379), Price = c(16.60149115, 16.26003051, 
15.75584352, 15.50772, 15.57, 15, 21.03744379, 20.60474416, 20.34032, 
20.32, 20), ad_revenue = c(1293145, 1270159.59, 1297991.2, 1362019.86, 
1330311.32, 1423933.04, 1499699.64, 1983487.176, 2034322.84, 
2010148.6, 2008107.84)), class = "data.frame", row.names = c(NA, 
11L))

#Run Models#
m1 <- glm(formula = Sales ~ Price, data = df)
m2 <- update(m1, . ~ . + Income)
m3 <- update(m2, . ~ . + ad_revenue)
m4 <- update(m3, . ~ . + Population)
library(memisc)
library(car)

#m3 is best# 
mtable(m1, m2, m3, m4) 

#No autocorrelation# 
durbinWatsonTest(m3) 

#Calculate Price Elasticity# 
PEm3 <- as.numeric(m3$coefficients['Price'] * mean(df$Price)/mean(df$Sales))

So, how I determine the optimum price? Can a curve be made using ggplot to indicate where the highest price can be charged before there is a decrease in sales?

The value of price elasticity is 1>x>-1 indicating it is relatively inelastic.

Please help me show a curve that is bell shaped indicating the highest price point before sales plummet. It would look like the one pictured here:

enter image description here

The difference being, instead of revenue on the y-axis it would be total sales.

Thanks


Solution

  • This StackOverflow answer shows how to find a local maximum of a regression curve. However, your regression models do not have local maximums, because the models are first order in all variables. For example, the model requires that Sales either always increases or always decreases as Price increases, depending on whether the regression coefficient on Price is positive or negative, respectively. The model would need to be at least second-order in Price for there to be a (modeled) Price at which (modeled) Sales are a maximum.

    Another thing to note is that the data do not show a simple relationship between Sales and Price. For example:

    library(ggplot2)
    theme_set(theme_classic())
    
    ggplot(df, aes(Price, Sales)) + 
      geom_line(colour="grey80", size=1) +
      geom_point()
    

    enter image description here

    If we plot the points in Year order, it looks like macroeconomic factors confound the Sales vs. Price relationship:

    ggplot(df, aes(Price, Sales)) + 
      geom_path(colour="grey80", size=1) +
      geom_text(aes(label=Year), colour='red') + 
      theme_classic()
    

    enter image description here

    UPDATE: To answer the question in your comment: The models in your example are linear in Price, meaning that Sales vs. Price will always be a line, rather than a curve. If you want a model in which Sales are allowed to fall and then rise (or rise and then fall) as Price increases, then Sales needs to be at least a quadratic (second order) function of Price (or some other type of model that has more flexibility than a linear function of Price).

    To illustrate, let's create linear and quadratic models of Sales vs. Price:

    Linear (first order) in Price. The model below is fitting this equation: Sales = a + b * Price, where a and b are the regression coefficients.

    m5a = glm(Sales ~ Price, data=df)
    

    Quadratic (second order) in Price. The model below is fitting this equation: Sales = a + b * Price + c * Price^2, where a, b, and c are the regression coefficients.

    m5b = glm(Sales ~ Price + I(Price^2), data=df)
    # m5b = glm(Sales ~ poly(Price, 2, raw=TRUE), data=df)  # Another way to specify the quadratic model
    

    Now let's plot the predictions of these two models along with the data. Note in the graph below that the linear-in-Price model can only go in one direction. Sales increases at a constant rate as Price increases. On the other hand, the quadratic-in-Price model is a parabola in which Sales initially fall with increasing Price and then rise with increasing Price.

    The quadratic model fits the data better, but neither model seems to make much sense economically.

    # Set up data frame for predictions
    pred.dat = data.frame(Price = seq(min(df$Price), max(df$Price), length=100))
    
    # Add predictions from the two models
    pred.dat$linear = predict(m5a, newdata=pred.dat)
    pred.dat$quadratic = predict(m5b, newdata=pred.dat)
    
    # Reshape prediction data to long format and plot
    pred.dat %>% gather(Model, Sales, -Price) %>% 
      ggplot(aes(Price, Sales)) +
        geom_point(data=df) +  # Add data points
        geom_line(aes(colour=Model))
    

    enter image description here