Search code examples
rtime-seriesregressionexponentialnls

How to fit a function that is an exponential function times a sum of two sinusoidal functions


I am having difficulty getting nls or nlsLM to fit a function to the following time series in R.

  [1] 143.7083 143.7083 141.5833 139.8750 138.8333 137.9167 136.4583 134.8750 134.3750
 [10] 134.1250 133.7917 132.5000 129.2083 127.2917 127.9583 129.1667 129.8333 130.0000
 [19] 130.5000 131.1667 131.8750 132.2500 131.5417 132.1667 133.5417 133.7500 133.7500
 [28] 134.2083 134.7083 135.2500 136.2083 136.7083 136.7500 135.2083 134.4167 134.6667
 [37] 134.6667 135.4583 136.3750 136.5000 137.3750 138.2083 137.4167 137.0833 137.0417
 [46] 138.2500 139.5000 140.0417 139.1667 138.0833 137.8750 138.1250 137.5000 136.0417
 [55] 135.7917 135.2083 135.4583 136.5000 136.5417 135.2083 133.7083 132.0000 131.4583
 [64] 131.9167 132.4167 132.2500 132.2500 134.1250 134.7083 133.7083 133.0833 131.9167
 [73] 130.8750 131.7500 131.5833 129.4167 127.4167 127.4583 127.3750 125.7500 124.7500
 [82] 124.6667 125.5833 129.4583 132.2917 131.8333 132.2083 134.2917 136.7500 138.2917
 [91] 139.0000 138.5417 138.9167 138.5000 137.3333 136.3750 135.4167 133.4167 130.5000
[100] 128.8333 127.6667 125.8750 125.0833 126.1667 125.8750 126.0000 127.1667 124.6250
[109] 122.2917 124.4167 126.9583 128.2083 129.4167 131.5000 132.2500 132.6250 134.1250
[118] 135.5000 135.4583 137.5833 141.1667 142.0000 139.6667 138.0417 137.7917 136.2917
[127] 135.2500 134.4583 133.8750 132.8333 132.7083 131.1250 127.3750 126.1250 129.5417
[136] 130.6667 128.4167 127.9167 128.6250 129.1250 129.8750 131.0833 131.2083 132.3333
[145] 134.5000 134.5833 132.5000 132.2083 134.0833 135.2500 134.1667 132.7083 130.5417
[154] 128.5833 127.4167 125.5000 124.0000 122.2500 122.5833 124.0000 123.1250 122.2500
[163] 122.8333 123.1667 122.6667 122.6250 123.9167 124.6250 125.6667 128.1250 128.5000
[172] 127.2500 127.3750 128.1250 128.3333 128.3750 129.0417 128.7500 127.7500 130.1667
[181] 131.2500 130.3333 129.4583 127.8333 127.6250 127.0417 125.8333 125.8750 126.8750
[190] 127.7917 127.7083 126.7917 126.0833 125.4583 126.5000 128.0000 127.0833 126.5000
[199] 127.4167 128.3750 128.2917 127.8750 128.4583 128.5833 130.3750 132.2500 131.0833
[208] 130.9167 130.7917 130.1250 130.3333 130.0833 129.0833 128.1667 126.2083 124.5417
[217] 122.4167 119.4583 119.9583 121.3333 122.9583 125.3333 126.7500 127.6667 128.5417
[226] 129.7500 132.2083 133.1667 132.7083 133.6250 132.2917 130.4583 130.3750 129.9583
[235] 129.2917 128.2917 128.0833 127.8750 126.2500 125.3750 124.5417 123.7917 124.2917
[244] 124.0417 123.9583 123.7500 122.9167 122.7083 123.4583 125.1250 127.0417 128.1250
[253] 128.8333 130.0417 131.0833 131.0417 129.7917 129.4167 129.8750 129.7917 129.7500
[262] 130.9167 131.8750 133.7500 137.7083 139.3333 139.5000 140.3750 141.2917 140.9167
[271] 140.2500 140.5000 140.1667 137.5000 135.1250 135.1667 134.7917 134.3750 136.0000
[280] 138.3333 138.5000 138.9167 139.4583 139.2917 139.5833 140.2917 140.9583 139.5833
[289] 139.1250 139.5833 137.9583 136.6667 136.7500 135.7083 134.5833 134.4583 133.5833
[298] 132.0000 130.9167 130.7500 130.2917 130.8750 131.2917 129.1250 128.4167 130.1667
[307] 131.1250 131.6667 132.4583 134.3333 137.2500 138.5000 137.1250 137.2917 140.1667
[316] 143.3750 144.4167 144.1250 144.8333 144.5833 143.8750 142.0417 137.7083 133.7083
[325] 130.1250 125.3750 122.7917 122.5417 123.0000 122.7917 122.7083 124.5000 126.9583
[334] 129.1250 132.5833 136.3750 139.8333 140.2917 137.0417 135.3750 134.9583 135.6250
[343] 135.7917 135.3750 134.4167 132.8750 131.7917 131.0833 131.2917 131.2917 132.0417
[352] 132.6250 132.4583 131.2917 129.4583 127.8333 127.0000 127.4583 128.1667 129.4167
[361] 130.5417 132.4583 133.0000 132.8750 133.8750 134.5833 135.9167 137.4167 138.5833
[370] 139.0833 137.5000 134.5833 131.3333 131.0833 132.4167 132.1250 131.2083 131.0833
[379] 131.2917 129.1250 126.5833 125.3750 125.7500 126.5833 127.1250 126.9583 127.2083
[388] 126.7500 126.7917 127.2917 127.0000 127.8333 128.4167 128.9167 129.8333 131.9583
[397] 135.0833 136.5417 137.0000 137.7500 136.7083 135.3750 135.2083 135.9167 137.0417
[406] 138.4167 139.7500 139.0833 137.5417 138.9583 139.8333 139.0000 139.5833 140.7500
[415] 141.2083 140.7917 139.9167 139.7500 139.1250 139.2500 139.5000 136.3333 135.5000
[424] 137.7917 139.5000 139.1250 138.4167 138.3333 138.6667 138.6667 140.2500 140.2083
[433] 139.0417 139.9583 139.5000 138.1667 137.0833 137.7083 138.9583 139.1667 138.6250
[442] 138.1667 136.0417 135.1250 135.9167 135.2500 134.2917 133.5000 132.8750 132.9167
[451] 133.0000 133.4167 134.4167 133.2083 131.9583 132.5833 132.2500 132.5833 133.6667
[460] 133.9583 133.7917 133.1250 132.7083 133.2500 132.9583 132.0417 131.9167 131.0000
[469] 132.2083 134.1667 134.1667 134.3750 134.3333 134.9583 135.1250 133.7917 133.9167
[478] 136.4583 137.5833 134.7083 133.1667 135.5000 137.6250 138.4583 138.5833 137.7917
[487] 137.7083 138.2917 137.4167 136.7917 137.7500 142.2500 144.8750 142.8750 141.8750
[496] 142.1250 144.0833 145.5000 145.2917 145.4167 145.9583 145.5000 144.4167 143.4167
[505] 142.5833 142.0833 140.7917 139.1667 136.5000 133.6250 132.5417 131.4583 131.0000
[514] 130.5833 129.9167 129.8333 130.0833 129.1667 127.4167 127.0833 128.1667 130.1250
[523] 131.8333 133.5833 134.5417 134.9583 135.1250 133.9583 131.0833 129.9167 131.6250
[532] 134.0000 134.4167 134.1667 134.9583 134.2500 133.0417 132.5000 133.2917 135.9583
[541] 139.8333 141.8333 142.2083 141.7500 141.8333 142.2917 141.7500 142.7917 144.4167
[550] 146.0417 145.0417 140.7083 137.0833 135.7917 134.0833 131.9583 130.9167 130.1250
[559] 128.6667 126.3333 125.0417 124.2083 123.4167 125.0833 127.3333 129.1250 132.0000
[568] 132.9167 132.1250 131.1250 131.0000 131.7083 131.8750 132.1250 133.7500 134.7917
[577] 136.1250 138.0000 138.3333 138.9583 139.8750 140.5833 140.6250 140.9583 142.3750
[586] 143.2500 143.1250 143.3750 143.1667 141.8333 140.2083 139.4583 139.9583 140.4583
[595] 141.4583 142.3333 141.6250 140.4583 140.9583 141.7917 140.2917 139.0833 140.7083
[604] 142.0417 142.2917 143.2083 142.1667 140.1250 139.1667 139.0833 137.8750 134.3750
[613] 132.9167 133.2083 131.4167 129.9583 129.3750 128.1667 127.1667 126.7500 126.0833
[622] 124.4167 124.6250 127.7500 130.3333 130.5833 131.1667 132.3333 132.4167 131.9583
[631] 132.9167 134.2917 135.6250 137.4583 137.2917 134.9583 131.2500 127.6667 126.2500
[640] 125.1667 123.4583 122.4167 121.9583 121.7083 121.6250 120.8750 121.2917 122.7917
[649] 123.1667 123.6667 124.0000 124.4167 127.4167 129.8750 129.8333 129.2500 128.4583
[658] 128.4167 127.7500 127.0833 128.5417 130.1667 129.1250 127.4167 124.2083 121.6250
[667] 122.2500 123.2917 124.2500 125.1667 126.2083 125.8333 126.7500 128.2500 128.9167
[676] 130.0833 131.4167 132.2500 132.0833 131.8750 130.6667 129.0000 128.7083 130.7500
[685] 131.8333 129.8750 128.8750 128.5833 128.0417 128.2083 127.7917 127.3333 127.9167
[694] 130.3750 130.9167 128.8750 127.1250 128.2500 130.7500 131.9583 132.5417 133.0417
[703] 134.0000 133.2500 132.6250 130.8333 128.7917 129.5417 129.2917 128.0000 126.9583
[712] 126.1667 127.0417 127.4167 126.6250 127.8750 129.0000 129.6250 132.1667 132.8750
[721] 132.5417 131.6667 131.4167 133.2917 134.5000 136.4583 139.1250 140.2500 141.0417
[730] 142.0417 141.2917 138.9167 137.0000 138.5833 139.8333 139.4583 140.0833 140.0000
[739] 140.0417 140.3750 140.7083 141.0833 142.2500 146.5417 151.9583 154.7500 157.3750
[748] 158.8333 158.5833 158.5000 158.2917 159.0417 158.9583 159.0000 158.0833 156.9167
[757] 156.9167 155.7917 152.2917 149.7917 149.2500 149.0000

I have an equation that looks like fits it pretty well when plotted against the time series.

(-6.1253+0.0859744+2.14215 + -6.1253*0.0859744*-7.19365e-10*1.00001^(10600.6546*
((1:762) + -510.4607))) * (sin((((1:762) + 4624.28)/3699.41)^7.11555 + -5.20009) + 
sin((((1:762) + -0.873172)/48.5129)^1.06537 + -9.29727))

When I use nls or nlslm and use the parameters it gives to plot the equation against the data, the equation strays from the data near the end where the exponential behavior becomes significant, even when I only leave one of the numbers as a parameter.

mntm_trend_fit_sin_exp=nlsLM(mntm_trend_num ~ 133.495+0.909436 + 
(-6.1253+0.0859744+2.14215 + -6.1253*0.0859744*-7.19365e-10*1.00001^(inexpco*((1:762) + 
-510.4607))) * (sin((((1:762) + 4624.28)/3699.41)^7.11555 + -5.20009) + sin((((1:762) + 
-0.873172)*48.5129)^1.06537 + -9.29727)), start=list(inexpco=10600.6546), trace=TRUE)

mntm_trend_num is the posted time series.

The aforementioned function has a significantly better RMSE and MAE, yet 'nls' and 'nlslm' strays away from my starting point of inexpco=10600.

Any and all suggestions are invited.

Thank you in advance.


Solution

  • I'm not sure if I read the data correctly or not, but there seems to be 762 observations and you have 768 hard coded into your objective function. Here's the summary I get

    summary(mntm_trend_num)
    #    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
    #   119.5   128.7   132.7   133.5   137.8   159.0 
    

    And let's define an objective function that we want to seta function that accepts inexpco as a parameter and calculates the values

    inexpcoFn <-function(x, N=762) {
        133.495+0.909436 + 
        (-6.1253+0.0859744+2.14215 +
        -6.1253*0.0859744*
        -7.19365e-10*1.00001^(x*((1:N) + -510.4607))) * 
        (sin((((1:N) + 4624.28)/3699.41)^7.11555 + -5.20009) + 
        sin((((1:N) + -0.873172)*48.5129)^1.06537 + -9.29727))
    }
    

    so running the optimization gives

    library(minpack.lm)
    mntm_trend_fit_sin_exp <- nlsLM(mntm_trend_num~inexpcoFn(inexpco), 
         start=list(inexpco=10600.6546), trace=TRUE)
    

    So the max best fit is at

    coef(mntm_trend_fit_sin_exp)
    #  inexpco 
    # 9949.674 
    

    This has a lower sum of squared errors than your starting point

    sum((inexpcoFn(10600.6546)-mntm_trend_num)^2)
    # [1] 68096.25
    sum((inexpcoFn(coef(mntm_trend_fit_sin_exp))-mntm_trend_num)^2)
    # [1] 32313.87
    

    And if we compare your "good fit"

    goodFn<-function(N=762) {
        (-6.1253+0.0859744+2.14215 + -6.1253*0.0859744*-7.19365e-
         10*1.00001^(10600.6546*((1:N) + -510.4607))) * 
        (sin((((1:N) + 4624.28)/3699.41)^7.11555 + -5.20009) + 
         sin((((1:N) + -0.873172)/48.5129)^1.06537 + -9.29727))
    }
    
    sum((goodFn()-mntm_trend_num)^2)
    # [1] 13581976
    

    Again we see a big reduction in the overall error using nlsLM so i'm confused how you were evaluating the result.