How to EXTRAPOLATE missing data with R in panel data?

In a Panel Data, I have some NA values that I would like to extrapolate at the end or start of my years of interest. Most solutions to similar questions deal with interpolation. Please note that this is not the case.

A similar sample of my data called "data" looks like this:

> data
   REGION YEAR  VALUE
1       A 2011     NA
2       A 2012     NA
3       A 2013     NA
4       A 2014  20.00
5       A 2015  25.00
6       A 2016  30.00
7       A 2017  35.00
8       A 2018  40.00
9       B 2011     NA
10      B 2012   0.30
11      B 2013   0.50
12      B 2014   0.70
13      B 2015   0.90
14      B 2016   0.11
15      B 2017   0.13
16      B 2018   0.15
17      C 2011 100.00
18      C 2012 101.00
19      C 2013 102.00
20      C 2014 103.00
21      C 2015 104.00
22      C 2016 105.00
23      C 2017 106.00
24      C 2018     NA

Some solutions that I found to similar questions:

I have try with na_interpolate but it just repeats the last values. I also tried with mutate combined with na.spline, but it creates values that are obviously wrong for a linear regression

I am sure that there must be a simple way to estimate these values.

The expected result should look like this:

> data
   REGION YEAR  VALUE
1       A 2011   5.00
2       A 2012  10.00
3       A 2013  15.00
4       A 2014  20.00
5       A 2015  25.00
6       A 2016  30.00
7       A 2017  35.00
8       A 2018  40.00
9       B 2011   0.10
10      B 2012   0.30
11      B 2013   0.50
12      B 2014   0.70
13      B 2015   0.90
14      B 2016   0.11
15      B 2017   0.13
16      B 2018   0.15
17      C 2011 100.00
18      C 2012 101.00
19      C 2013 102.00
20      C 2014 103.00
21      C 2015 104.00
22      C 2016 105.00
23      C 2017 106.00
24      C 2018 107.00

Thank you for your help

Solution

You can just do:

predictions <- round(predict(lm(VALUE ~ REGION * YEAR, df), newdata = df), 2)
predictions
#>      1      2      3      4      5      6      7      8      9     10 
#>   5.00  10.00  15.00  20.00  25.00  30.00  35.00  40.00   0.65   0.59 
#>     11     12     13     14     15     16     17     18     19     20 
#>   0.53   0.46   0.40   0.34   0.27   0.21 100.00 101.00 102.00 103.00 
#>     21     22     23     24 
#> 104.00 105.00 106.00 107.00 

df$VALUE[is.na(df$VALUE)] <- predictions[is.na(df$VALUE)]

Which gives the desired result:

df
#>    REGION YEAR  VALUE
#> 1       A 2011   5.00
#> 2       A 2012  10.00
#> 3       A 2013  15.00
#> 4       A 2014  20.00
#> 5       A 2015  25.00
#> 6       A 2016  30.00
#> 7       A 2017  35.00
#> 8       A 2018  40.00
#> 9       B 2011   0.65
#> 10      B 2012   0.30
#> 11      B 2013   0.50
#> 12      B 2014   0.70
#> 13      B 2015   0.90
#> 14      B 2016   0.11
#> 15      B 2017   0.13
#> 16      B 2018   0.15
#> 17      C 2011 100.00
#> 18      C 2012 101.00
#> 19      C 2013 102.00
#> 20      C 2014 103.00
#> 21      C 2015 104.00
#> 22      C 2016 105.00
#> 23      C 2017 106.00
#> 24      C 2018 107.00