r parameters interpolation data-fitting

Discrepancy between y values calculated using predict() or using explicit fitting equation


If I calculate the y value for a specific x value using the predict() function, I obtain a value different from the one I calculate using the explicit fitting equation.

I fitted the data below using nls(MyEquation) and obtained the parameters m1, m2, ... Then I want to back-calculate the y value for a specific x value in two ways: with predict(m), and with the explicit equation I used for fitting (plugging in the desired x value and the fitted parameters). I obtain different y values for the same x value. Which one is correct?

> df
    pH activity
1  3.0     0.88
2  4.0     1.90
3  5.0    19.30
4  6.0    70.32
5  7.0   100.40
6  7.5   100.00
7  8.0    79.80
8  9.0     7.75
9 10.0     1.21

x <- df$pH
y <- df$activity
m<-nls(y~(m1*(10^(-x))+m2*10^(-m3))/(10^(-m3)+10^(-x)) - (m5*(10^(-x))+1*10^(-i))/(10^(-i)+10^(-x)), start = list(m1=1,m2=100,m3=7,m5=1))

> m
Nonlinear regression model
  model: y ~ (m1 * (10^(-x)) + m2 * 10^(-m3))/(10^(-m3) + 10^(-x)) - (m5 *     (10^(-x)) + 1 * 10^(-i))/(10^(-i) + 10^(-x))
   data: parent.frame()
      m1       m2       m3       m5 
-176.032   13.042    6.282 -180.704 
 residual sum-of-squares: 1522

Number of iterations to convergence: 14 
Achieved convergence tolerance: 5.805e-06

list2env(as.list(coef(m)), .GlobalEnv)

#calculate y based on fitting parameters
# choose the 7th x value (i.e. x[7]) that corresponds to pH = 8
# (using predict)
> x_pH8 <- x[7]
> predict(m)[7]
[1] 52.14299

# (using the explicit fitting equation with the fitted parameters)
> x1 <- x_pH8
> (m1*(10^(-x1))+m2*10^(-m3))/(10^(-m3)+10^(-x1)) - (m5*(10^(-x1))+1*10^(-8.3))/(10^(-8.3)+10^(-x1))
[1] 129.5284

As you can see: predict(m)[7] gives y = 52.14299 (for x = 8)

while

(m1*(10^(-x1))+m2*10^(-m3))/(10^(-m3)+10^(-x1)) - (m5*(10^(-x1))+1*10^(-8.3))/(10^(-8.3)+10^(-x1)) gives y = 129.5284 (for x = 8)


Solution

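  • The value of i in your manual calculation (hard-coded as 8.3) is probably not the value i had when the model was fitted. Because i is not listed in start, nls() does not estimate it; it looks i up in the environment where the formula was created (here, your workspace) at fitting time. With i set to 8.3 before fitting, I don't get any discrepancy: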

    x <- df$pH
    y <- df$activity
    
    i <- 8.3
    
    m <- nls(y~(m1*(10^(-x))+m2*10^(-m3))/(10^(-m3)+10^(-x)) - (m5*(10^(-x))+1*10^(-i))/(10^(-i)+10^(-x)), start = list(m1=1,m2=100,m3=7,m5=1))
    
    # evaluate the fitted equation by hand at pH = 8, using the same i as in the fit
    x <- 8
    with(as.list(coef(m)), 
         (m1*(10^(-x))+m2*10^(-m3))/(10^(-m3)+10^(-x)) - (m5*(10^(-x))+1*10^(-i))/(10^(-i)+10^(-x)))
    # [1] 75.46504
    
    predict(m)[7]
    # [1] 75.46504
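
One way to avoid this kind of mismatch is to write the equation once, as a function with i as an explicit argument, and reuse that function for both the fit and the manual check. The following is only a sketch under assumptions: the names bell and i_fixed are hypothetical (they do not appear in the question), and i = 8.3 is taken from the manual calculation above.

```r
# Data from the question
df <- data.frame(
  pH       = c(3, 4, 5, 6, 7, 7.5, 8, 9, 10),
  activity = c(0.88, 1.90, 19.30, 70.32, 100.40, 100.00, 79.80, 7.75, 1.21)
)

# Hypothetical helper: the fitting equation, with i passed in explicitly
bell <- function(x, m1, m2, m3, m5, i) {
  (m1 * 10^(-x) + m2 * 10^(-m3)) / (10^(-m3) + 10^(-x)) -
    (m5 * 10^(-x) + 1 * 10^(-i)) / (10^(-i) + 10^(-x))
}

# Assumed fixed constant (not estimated); 8.3 as in the manual calculation
i_fixed <- 8.3

# Fit only m1, m2, m3, m5; i_fixed is not in start, so it stays constant
m <- nls(activity ~ bell(pH, m1, m2, m3, m5, i = i_fixed),
         data  = df,
         start = list(m1 = 1, m2 = 100, m3 = 7, m5 = 1))

# Prediction at pH = 8 via predict() with explicit newdata ...
by_predict <- predict(m, newdata = data.frame(pH = 8))

# ... and via the same function with the fitted coefficients
by_formula <- do.call(bell, c(list(x = 8, i = i_fixed), as.list(coef(m))))

# The two should agree up to floating-point tolerance
all.equal(as.numeric(by_predict), by_formula)
```

Because both paths call the same function with the same i_fixed, there is no second copy of the equation in which a different constant can slip in. Passing newdata to predict() also removes the need to know which row index corresponds to the pH of interest.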