Search code examples
rnon-linear-regression

Non linear fit with R


I try to obtain the first three coefficients for Cauchy's dispersion equation for Silicon. Using a csv containing the refractive index for some wavelengths (that you can find here), I try to fit the following model :

library(readr)
library(tidyverse)
library(magrittr)
library(modelr)
library(broom)
library(splines)

# CSV parsing
RefractiveIndexINFO <- read_csv("./silicon-index.csv")

# Cleaning the output of the csv-parsing
indlong = tibble(RefractiveIndexINFO$`Wavelength. µm`,RefractiveIndexINFO$n)
names(indlong) = c('w','n')

# Remove some wavelengths that might not fit
indlong_non_uv = indlong %>% filter(indlong$w >= 0.4)

# Renaming variables
w = indlong_non_uv$w
n = indlong_non_uv$n

# Creating the non linear model
model = nls(n ~ a + b*ns(w,-2) + c*ns(w,-4), data = indlong_non_uv)

# Gathering informations on the fitted model
cor(indlong_non_uv$n,predict(model))
tidy(model)

Which gives the following error :

Error in c * ns(w, -4) : non-numeric argument to binary operator

How can I circumvent this situation and get the three coefficients (a,b,c) in a row ?

Obviously, using model = nls(n ~ a + b*ns(w,-2), data = indlong_non_uv) does not give an error.


Solution

  • Try this:

    library(readr)
    library(tidyverse)
    library(magrittr)
    library(modelr)
    library(broom)
    library(splines)
    
    # CSV parsing
    RefractiveIndexINFO <- read_csv("aspnes.csv")
    RefractiveIndexINFO <- RefractiveIndexINFO[1:46,]
    RefractiveIndexINFO <- as.data.frame(apply(RefractiveIndexINFO,2,as.numeric))
    names(RefractiveIndexINFO) <- c('w','n')
    
    indlong_non_uv = RefractiveIndexINFO %>% filter(RefractiveIndexINFO$w >= 0.4)
    
    # Creating the nonlinear model
    model <- nls(n ~ a + b*w^(-2) + c*w^(-4), data = indlong_non_uv,
       start=list(a=1, b=1, c=1))
    
    # Gathering informations on the fitted model
    cor(indlong_non_uv$n,predict(model))
    # [1] 0.9991006
    
    tidy(model)
    #   term    estimate   std.error statistic      p.value
    # 1    a  3.65925186 0.039368851 92.947896 9.686805e-20
    # 2    b -0.04981151 0.024099580 -2.066904 5.926046e-02
    # 3    c  0.05282668 0.003306895 15.974707 6.334197e-10
    

    Alternatively, you can use linear regression:

    model2 <- lm(n ~ I(w^(-2)) + I(w^(-4)), data = indlong_non_uv)
    summary(model2)
    
    # Coefficients:
    #              Estimate Std. Error t value Pr(>|t|)    
    # (Intercept)  3.659252   0.039369  92.948  < 2e-16 ***
    # I(w^(-2))   -0.049812   0.024100  -2.067   0.0593 .  
    # I(w^(-4))    0.052827   0.003307  15.975 6.33e-10 ***