The values list below contains annual indices of sunspot activity from 1900-1923. I need to fit a sin curve to this data with a period of 10.4 years, with lm() in R. Despite reading other related posts, I couldn't figure this out yet. The dependent will be the observed values, but for the regressor sin() I am not sure how to set the period as 10.4 years. Could anyone give me a hint?
values <- c(113.5,32.9, 60.3,92.6,503.4,761.6,646.3,744.4, 582.5,526.6,223.0, 68.4, 43.1,17.3, 115.1,568.4,684.8, 1246.7,966.9,763.3,
451.7,313.6,170.9,69.3)
lm()
is made for models that are linear in parameters, which is not the case of sine models.
Such models can be estimate with nls()
, which is made for nonlinear models.
The general form of the model would be: y = a + b*sin(c + d*x)
With your hypothesis d
would be around 2*pi/10.4
, and we can set a
and b
to the average y value and c
to 0.
This results in the following expected fit:
library(tidyverse)
df <- tibble(year = 1900:1923,
values = values) %>%
mutate(expected_fit = mean(values) + sin(year * (2*pi/10.4)) * mean(values))
ggplot() +
geom_point(data = df, aes(x = year, y = values), color = "blue") +
geom_line(data = df, aes(x = year, y = expected_fit), color = "red")
We can now use nls()
to fit this model, specifying our hypothesized parameters for the function to start there.
res <- summary(nls(values ~ a + b*sin(c + d*year), data = df,
start = list(a = mean(df$values), b = mean(df$values),
c = 0, d = 2*pi/10.4)))$coefficients
res
Estimate Std. Error t value Pr(>|t|)
a 442.371777 27.12392790 16.309282 5.093534e-13
b 439.541330 35.88116525 12.249918 9.437063e-11
c 54.939557 26.43198489 2.078526 5.074681e-02
d 0.575082 0.01382716 41.590751 6.740719e-21
We can now check that nls
converged towards appropriate coefficients by computing and plotting the fitted values from this model.
df <- df %>%
mutate(fit = res["a", 1]+res["b", 1]*sin(res["c", 1]+res["d", 1]*year))
ggplot() +
geom_point(data = df, aes(x = year, y = values), color = "blue") +
geom_line(data = df, aes(x = year, y = fit), color = "red")