Estimate future growth using a sample of historical data

I have historical records of the size of our database for the past couple of years. I am trying to figure out the best way (or graph) to project the future growth of the database from these records. Of course, this won't account for changes such as adding a new table that also grows, but I am just looking for a rough estimate. I am open to ideas in Python or R.

Here is the size of the database in TB by year:

3.895 - 2012
6.863 - 2013
8.997 - 2014
10.626 - 2015

Solution

d <- data.frame(x= 2012:2015,
            y = c(3.895, 6.863, 8.997, 10.626))

You can visualize the fit (and its projection). Here I'm comparing a quadratic polynomial model (blue) with an additive model (GAM, red). I'm not sure I believe the confidence intervals on the additive model, though:

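library("ggplot2")      # geom_smooth(method="gam") fits via mgcv::gam
theme_set(theme_bw())
ggplot(d, aes(x, y)) + geom_point() +
    expand_limits(x=2018) +                        # extend the x axis so the projection is visible
    geom_smooth(method="lm", formula=y~poly(x,2),  # quadratic fit (blue)
                fullrange=TRUE, fill="blue") +
    geom_smooth(method="gam", formula=y~s(x,k=3), colour="red",  # additive model (red)
                fullrange=TRUE, fill="red")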

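I'm a little shocked at how closely the quadratic fits, although with three coefficients and only four data points there is just one residual degree of freedom left: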

summary(m1 <- lm(y~poly(x,2),data=d))
## Residual standard error: 0.07357 on 1 degrees of freedom
## Multiple R-squared:  0.9998, Adjusted R-squared:  0.9994 
## F-statistic:  2344 on 2 and 1 DF,  p-value: 0.0146

Predict the next three years (note how quickly the confidence intervals widen):

predict(m1,newdata=data.frame(x=2016:2018),interval="confidence")
##        fit      lwr      upr
## 1 11.50325 8.901008 14.10549
## 2 11.72745 6.361774 17.09313
## 3 11.28215 2.192911 20.37139
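
Since the question allows Python as well, here is a minimal sketch that reproduces the point predictions above with the standard library only. It fits the same quadratic by ordinary least squares on centered years; the helper names (solve, design_row, predict) are made up for this illustration, and in practice numpy.polyfit would do the fit in one call. It also computes where the fitted parabola peaks, which is why the 2018 prediction comes out lower than the 2017 one: the fitted curvature is negative, so the curve eventually turns downward, which is not a plausible shape for database growth.

```python
years = [2012, 2013, 2014, 2015]
size_tb = [3.895, 6.863, 8.997, 10.626]

# Center the years so the normal equations are well conditioned.
center = sum(years) / len(years)


def design_row(year):
    """Return the regressors [1, t, t^2] for a year, with t = year - center."""
    t = year - center
    return [1.0, t, t * t]


def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


# Normal equations: (X'X) coef = X'y
X = [design_row(y) for y in years]
XtX = [[sum(r[i] * r[j] for r in X) for j in range(3)] for i in range(3)]
Xty = [sum(r[i] * y for r, y in zip(X, size_tb)) for i in range(3)]
a, b, c = solve(XtX, Xty)


def predict(year):
    """Point prediction of the fitted quadratic for a given year."""
    return sum(coef * v for coef, v in zip((a, b, c), design_row(year)))


# Vertex of the parabola: where the fitted curve stops growing.
peak_year = center - b / (2 * c)

for year in (2016, 2017, 2018):
    print(year, round(predict(year), 3))
print("fitted curve peaks near", round(peak_year, 2))
```

This only gives point predictions; the confidence intervals still need something like predict(..., interval="confidence") in R.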

Did you make up these numbers, or are they real data?

The forecast package would be better for more sophisticated time-series methods, such as exponential smoothing (ets()) or ARIMA (auto.arima()).
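
To give a rough idea of what a trend-based forecasting method does, here is a hand-written sketch of Holt's linear trend method with optional damping, in plain Python. This is not the forecast package's implementation. The function name holt_forecast is hypothetical, and the smoothing parameters (alpha=0.8, beta=0.2, phi) are arbitrary values I picked for illustration; a real library would estimate them from the data. With only four annual points, treat any such forecast as very rough.

```python
def holt_forecast(ys, horizon, alpha=0.8, beta=0.2, phi=1.0):
    """Holt's linear trend forecast; phi < 1 damps the trend, phi = 1 leaves it linear.

    alpha, beta and phi are assumed values here, not estimated from the data.
    """
    # Simple initialisation: first observation as level, first difference as trend.
    level = ys[0]
    trend = ys[1] - ys[0]
    for y in ys[1:]:
        prev_level = level
        # Blend the new observation with the one-step-ahead forecast.
        level = alpha * y + (1 - alpha) * (prev_level + phi * trend)
        # Blend the latest change in level with the previous (damped) trend.
        trend = beta * (level - prev_level) + (1 - beta) * phi * trend
    forecasts = []
    damp_sum = 0.0
    for h in range(1, horizon + 1):
        damp_sum += phi ** h  # phi + phi^2 + ... + phi^h
        forecasts.append(level + damp_sum * trend)
    return forecasts


size_tb = [3.895, 6.863, 8.997, 10.626]  # 2012..2015, from the question

linear = holt_forecast(size_tb, horizon=3)            # undamped trend
damped = holt_forecast(size_tb, horizon=3, phi=0.8)   # trend flattens over time

for year, lin, dmp in zip((2016, 2017, 2018), linear, damped):
    print(year, round(lin, 2), round(dmp, 2))
```

The undamped version extrapolates a straight line from the last smoothed level and trend, while the damped version lets growth slow down without ever turning negative, unlike the quadratic fit. How far apart the two come out is a reminder of how much the answer depends on the model you assume.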