I have been at it all day. I'm pretty below average when it comes to R, so I'm not quite at the level of writing my own scripts/functions yet (but trying.. and failing)
I have data of annual sales, revenue, price of a good, median incomes and population.
Using glm, I eliminated some of the variables (autocorrelation, insignificance etc.).
Here is a dput of the dataframe I'm using:
dput(df)
structure(list(Year = 2008:2018, Sales = c(50681L, 53016L, 53981L,
56204L, 55564L, 56916L, 61328L, 59686L, 59412L, 57298L, 57569L
), Population = c(9250000L, 9380000L, 9570000L, 9660000L, 9760000L,
9850000L, 9940000L, 10040000L, 10160000L, 10270000L, 10454860L
), Income = c(52941L, 53127L, 50020L, 48816L, 47969L, 48294L,
48385L, 48253L, 49489L, 51672L, 51752L), Price_up = c(15, 15.57,
15.50772, 15.75584352, 16.26003051, 16.60149115, 20, 20.32, 20.34032,
20.60474416, 21.03744379), Price = c(16.60149115, 16.26003051,
15.75584352, 15.50772, 15.57, 15, 21.03744379, 20.60474416, 20.34032,
20.32, 20), ad_revenue = c(1293145, 1270159.59, 1297991.2, 1362019.86,
1330311.32, 1423933.04, 1499699.64, 1983487.176, 2034322.84,
2010148.6, 2008107.84)), class = "data.frame", row.names = c(NA,
11L))
#Run Models#
m1 <- glm(formula = Sales ~ Price, data = df)
m2 <- update(m1, . ~ . + Income)
m3 <- update(m2, . ~ . + ad_revenue)
m4 <- update(m3, . ~ . + Population)
library(memisc)
library(car)
#m3 is best#
mtable(m1, m2, m3, m4)
#No autocorrelation#
durbinWatsonTest(m3)
#Calculate Price Elasticity#
PEm3 <- as.numeric(m3$coefficients['Price'] * mean(df$Price)/mean(df$Sales))
So, how I determine the optimum price? Can a curve be made using ggplot to indicate where the highest price can be charged before there is a decrease in sales?
The value of price elasticity is 1>x>-1 indicating it is relatively inelastic.
Please help me show a curve that is bell shaped indicating the highest price point before sales plummet. It would look like the one pictured here:
The difference being, instead of revenue on the y-axis it would be total sales.
Thanks
This StackOverflow answer shows how to find a local maximum of a regression curve. However, your regression models do not have local maximums, because the models are first order in all variables. For example, the model requires that Sales either always increases or always decreases as Price increases, depending on whether the regression coefficient on Price is positive or negative, respectively. The model would need to be at least second-order in Price for there to be a (modeled) Price at which (modeled) Sales are a maximum.
Another thing to note is that the data do not show a simple relationship between Sales and Price. For example:
library(ggplot2)
theme_set(theme_classic())
ggplot(df, aes(Price, Sales)) +
geom_line(colour="grey80", size=1) +
geom_point()
If we plot the points in Year order, it looks like macroeconomic factors confound the Sales vs. Price relationship:
ggplot(df, aes(Price, Sales)) +
geom_path(colour="grey80", size=1) +
geom_text(aes(label=Year), colour='red') +
theme_classic()
UPDATE: To answer the question in your comment: The models in your example are linear in Price
, meaning that Sales
vs. Price
will always be a line, rather than a curve. If you want a model in which Sales
are allowed to fall and then rise (or rise and then fall) as Price
increases, then Sales
needs to be at least a quadratic (second order) function of Price
(or some other type of model that has more flexibility than a linear function of Price
).
To illustrate, let's create linear and quadratic models of Sales
vs. Price
:
Linear (first order) in Price. The model below is fitting this equation: Sales = a + b * Price
, where a
and b
are the regression coefficients.
m5a = glm(Sales ~ Price, data=df)
Quadratic (second order) in Price. The model below is fitting this equation: Sales = a + b * Price + c * Price^2
, where a
, b
, and c
are the regression coefficients.
m5b = glm(Sales ~ Price + I(Price^2), data=df)
# m5b = glm(Sales ~ poly(Price, 2, raw=TRUE), data=df) # Another way to specify the quadratic model
Now let's plot the predictions of these two models along with the data. Note in the graph below that the linear-in-Price model can only go in one direction. Sales
increases at a constant rate as Price
increases. On the other hand, the quadratic-in-Price model is a parabola in which Sales
initially fall with increasing Price
and then rise with increasing Price
.
The quadratic model fits the data better, but neither model seems to make much sense economically.
# Set up data frame for predictions
pred.dat = data.frame(Price = seq(min(df$Price), max(df$Price), length=100))
# Add predictions from the two models
pred.dat$linear = predict(m5a, newdata=pred.dat)
pred.dat$quadratic = predict(m5b, newdata=pred.dat)
# Reshape prediction data to long format and plot
pred.dat %>% gather(Model, Sales, -Price) %>%
ggplot(aes(Price, Sales)) +
geom_point(data=df) + # Add data points
geom_line(aes(colour=Model))