Search code examples
rnlsstandard-errorsine-wave

Testing how well my test data fits to a sinusoidal template in R


In my dataset, I have split my data into testing and training. I have used the training data to fit a sinusoidal curve.

freq = seq(from=21, to=80, by=0.5)
amp = c(-45.22247,-44.87434,-44.30659, -44.104, -44.08027 ,-44.32827, -44.52562, -44.81116, -45.11858, -45.61112, -45.88737, -45.9517, -46.00517,-45.78666, -45.64515, -45.32656, -45.12009, -44.92572, -44.58918,-44.18919, -43.75679, -43.4801, -43.13948, -42.86826 ,-42.59199,-42.23767, -42.06125, -41.99265 ,-41.91103, -41.96361, -42.10183,-42.47087, -42.86456, -43.17823 ,-43.41674, -43.6427 ,-43.79429,-43.71881, -43.58482 ,-43.53886, -43.29379, -43.00895 ,-42.69728,-42.331 ,-42.11675, -41.84402, -41.63068 ,-41.28213, -40.88509,-40.64461 ,-40.47236 ,-40.49424, -40.73463, -41.02037, -41.49101, -41.95741 ,-42.36682, -42.83582 ,-43.08535, -43.16926, -43.20281,-43.08085, -43.08461, -42.96666, -42.70192, -42.57608 ,-42.30105,-42.01737, -41.58435 ,-41.24757, -40.96 ,-40.68981 ,-40.39803,-40.2478, -40.23024, -40.48872, -40.68104, -41.04532, -41.53119,-41.88053, -42.17658, -42.32618, -42.41549 ,-42.36513, -42.38881,-42.25508, -42.13853, -41.89402, -41.66664, -41.3869, -41.02896, -40.83986, -40.44932 ,-40.14661, -39.84292, -39.66413, -39.60376, -39.63048, -39.75595 ,-39.88888, -40.18054 ,-40.491, -40.62274, -40.8282, -40.92761, -41.1448, -41.0102 ,-41.06486, -40.96643,-40.93895 ,-40.72338, -40.58406, -40.44275 ,-40.28842, -40.23578, -40.04179, -39.93855, -39.94855, -39.90564)
ssp <- data.frame(freq, amp)

A<- (max(ssp$amp)-min(ssp$amp)/2)
C<-((max(ssp$amp)+min(ssp$amp))/2)

res1<- nlsLM(amp ~ A*sin(omega*freq+phi)+C, data=ssp, start=list(A=A,omega=pi/6,phi=1,C=C))
sumres=summary(res1)
co <- coef(res1)
#resid(res1)
#sum(resid(res1)^2)
fit <- function(x, a, b, c, d) {a*sin(b*x+c)+d}
# Plot result
plot(ssp)
curve(fit(x, a=co["A"], b=co["omega"], c=co["phi"], d=co["C"]), add=TRUE ,lwd=2, col="steelblue")

This provides me with this fit. Please note that I know the fit could be better, but that isn't necessary for the purposes of this post. It's close enough.

Modeled fit sine wave in blue, over the raw sine data.

sumres

Formula: amp ~ A * sin(omega * freq + phi) + C

Parameters:
       Estimate Std. Error  t value Pr(>|t|)    
A      -1.00030    0.20105   -4.975  2.3e-06 ***
omega   0.53007    0.01196   44.332  < 2e-16 ***
phi    -0.31121    0.64156   -0.485    0.629    
C     -42.19834    0.14390 -293.248  < 2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.552 on 115 degrees of freedom

Number of iterations to convergence: 7 
Achieved convergence tolerance: 1.49e-08

Now, I have testing data, that I would like to compare this modeled sine wave to. In theory, I should be able to once again use the nls (nonlinear least squares) function to get the residual standard error value, to understand how well this template fits other data. Rather than 'start' any of the variables, I simply inform the function what they are. However, I receive the following error when I do so:

newssp = similar to data above
res2<- nls(amp ~ co["A"]*sin(co["omega"]*freq+co["phi"])+co["C"], data=newssp)
Error in getInitial.default(func, data, mCall = as.list(match.call(func,  : 
  no 'getInitial' method found for "function" objects

It is my understanding that this is because the provided values may not be the best fit, but that is what I am looking for. I want to see how well my modeled sine wave fits against raw data. The only thing I can think of, is that for my coefficients, not all of them were significant.

Is there a better way to test how my modeled sine wave compares against raw data? Thanks!


Solution

  • How about instead giving your previously fitted values as the starting values and telling nls (via nls.control()) not to do any iterations of the fitting algorithm?

    res2 <- nls(amp ~ A*sin(omega*freq+phi)+C, data=newdata, 
        start=as.list(co), 
        control = nls.control(maxiter=0, warnOnly = TRUE))