Search code examples
rpredictnlsmodel-fittingdrc

R: Extracting values of x from a logistic curve fitted with nls


I'm pretty sure this has a very easy solution but I've been scratching my head at this all day and cannot seem to see where the problem lies.

I am graphing some pharmacological dose-response data (which is working fine), however as an initial step I want convert my somewhat meaningless response variable (output from a plate reader) into readable drug concentrations by extrapolating from a standard curve which I have generated.

The data used for my standard curve is as follows:

structure(list(log_Conc = c(3.45453998496482, 2.85247999363686, 
2.25042000230889, 1.64836001098093, 1.04532297878666, 0.444044795918076, 
-0.161150909262745, -0.763210900590707, -5), Average_log_val = c(0.137454, 
0.1506735, 0.2221405, 0.3992095, 0.846017, 1.5389285, 2.1200975, 
2.3373605, 2.368048)), class = c("grouped_df", "tbl_df", "tbl", 
"data.frame"), row.names = c(NA, -9L), groups = structure(list(
    log_Conc = c(-5, -0.763210900590707, -0.161150909262745, 
    0.444044795918076, 1.04532297878666, 1.64836001098093, 2.25042000230889, 
    2.85247999363686, 3.45453998496482), .rows = structure(list(
        9L, 8L, 7L, 6L, 5L, 4L, 3L, 2L, 1L), ptype = integer(0), class = c("vctrs_list_of", 
    "vctrs_vctr", "list"))), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -9L), .drop = TRUE))

My standard curve is modelled as a 3 parameter logistic which I have coded as such:

standard_curve_logistic <- nlsLM(Average_log_val ~ lower + (upper - lower)/(1 + 10^(log_Conc-midpoint)),
                              data = Standard_cAMP_Curve,
                              start = list(lower = min(Standard_cAMP_Curve$Average_log_val), upper = max(Standard_cAMP_Curve$Average_log_val), midpoint = mean(Standard_cAMP_Curve$log_Conc)))

Plotting the fit of the logistic shows it to model my data nicely

r <- range(Standard_cAMP_Curve$log_Conc)
xNew <- seq(r[1],r[2],length.out = 200)
yNew <- predict(standard_curve_logistic,list(log_Conc = xNew))

plot(Standard_cAMP_Curve$log_Conc,Standard_cAMP_Curve$Average_log_val)
lines(xNew,yNew)

Standard Curve

I now wanted to use the standard curve to find a series of x values for given y. Thus I re-arranged the formula of the standard curve, to solve for x. However, doing so returns values which don't seem to be correct at all. I am pretty sure I have re-arranged my formula correctly, so I don't see what's causing the issue here.

params = coef(standard_curve_logistic)

logistic_solve_for_x <- function(upper, lower, midpoint,y_values) { ## Solve algebraic of 3 param logistic for x
  x_value = (log10((upper-lower/y_values-lower) - 1) + midpoint) 
  return(x_value)
}

y_vector <- c(0.5,1,1.5,2) ## Some example values of y to illustrate my point

logistic_solve_for_x(upper = params[2], midpoint = params[3], lower = params[1], y_values = y_vector) 

# The results of this function are: [1] 0.6535569 0.7194098 0.7393143 0.7489347

Reading off the standard curve, you can see these values of x are clearly incorrect (e.g. I'd expect an x of around 0 for a y of 2 etc.)

If anyone can see where I have gone wrong here, I would be so very grateful!


Solution

  • A few preliminary comments:

    • nls would have worked here too and it is normally preferable because nlsLM can suffer from premature convergence. It sometimes appears to work when nls gives an error but usually in those cases there are real problems which ought to be solved and the error in nls alerts us to them.

    • code to SO should be self contained. library(minpack.lm) was missing

    The code from the question in the Note at the end should be run before running any of the code below.

    Here are a couple of alternative approaches.

    1) uniroot We can use uniroot to numerically solve the inversion. f is a function of log_Conc using parameters y and model and returns the difference between the prediction at log_Conc and y. uniroot will solve for the root of that function equal to zero.

    We show the solved points in red on the plot below.

    f <- function(log_Conc, y, model) predict(model, data.frame(log_Conc)) - y
    x_vector <- sapply(
      y_vector, 
      \(y) uniroot(f, r, y = y, model = standard_curve_logistic)[[1]]
    )
    x_vector
    ## [1] 1.419654677 0.898921902 0.503686303 0.007363285
    
    plot(Average_log_val ~ log_Conc, Standard_cAMP_Curve)
    lines(xNew, yNew)
    points(y_vector ~ x_vector, col = "red", pch = 20, cex = 2)
    

    (continued after graph) screenshot

    2) Ryacas0 We can invert the formula using the yacas computer algebra system via the Ryacas0 package. Remove the underscores to do that and then it is straight forward to manually create an R function that uses that solution or we can pick apart the yacas output and generate a function. We see that the result is similar to (1) to several decimals.

    library(Ryacas0)
    
    s <- "Solve(lower+(upper-lower)/(1+10^(logConc-midpoint))-yvector, logConc)"
    yac <- yacas(s); yac
    ## Yacas vector:
    ## [1] logConc == midpoint + log(-((upper - lower)/(lower - yvector) + 1))/log(10)
    
    ff <- function(yvector, lower, upper, midpoint) {}
    body(ff) <- yac$text[[1:3]]
    
    ff
    ## function (yvector, lower, upper, midpoint) 
    ## midpoint + log(-((upper - lower)/(lower - yvector) + 1))/log(10)
    
    ff(y_vector, params[1], params[2], params[3])
    ## [1] 1.419654165 0.898930737 0.503666159 0.007366106
    

    Note

    Run this code from the question before running the code above

    library(minpack.lm)
    
    Standard_cAMP_Curve <- 
    structure(list(log_Conc = c(3.45453998496482, 2.85247999363686, 
    2.25042000230889, 1.64836001098093, 1.04532297878666, 0.444044795918076, 
    -0.161150909262745, -0.763210900590707, -5), Average_log_val = c(0.137454, 
    0.1506735, 0.2221405, 0.3992095, 0.846017, 1.5389285, 2.1200975, 
    2.3373605, 2.368048)), class = c("grouped_df", "tbl_df", "tbl", 
    "data.frame"), row.names = c(NA, -9L), groups = structure(list(
        log_Conc = c(-5, -0.763210900590707, -0.161150909262745, 
        0.444044795918076, 1.04532297878666, 1.64836001098093, 2.25042000230889, 
        2.85247999363686, 3.45453998496482), .rows = structure(list(
            9L, 8L, 7L, 6L, 5L, 4L, 3L, 2L, 1L), ptype = integer(0),
        class = c("vctrs_list_of", "vctrs_vctr", "list"))),
    class = c("tbl_df", "tbl", "data.frame"),
    row.names = c(NA, -9L), .drop = TRUE))
    
    
    standard_curve_logistic <- nlsLM(
      Average_log_val ~ lower + (upper - lower)/(1 + 10^(log_Conc-midpoint)),
      data = Standard_cAMP_Curve,
      start = list(lower = min(Standard_cAMP_Curve$Average_log_val),
                   upper = max(Standard_cAMP_Curve$Average_log_val), 
                   midpoint = mean(Standard_cAMP_Curve$log_Conc)
               )
    )
      
    r <- range(Standard_cAMP_Curve$log_Conc)
    xNew <- seq(r[1], r[2], length.out = 200)
    yNew <- predict(standard_curve_logistic, list(log_Conc = xNew))
    
    params = coef(standard_curve_logistic)
    y_vector <- c(0.5, 1, 1.5, 2)