Fitting an exponential curve through scatterplot


I am starting to use R and have a bit of a problem. I have a dataset called ADC_dark containing 20 points where leaf temperature and respiration are measured.

I expect an exponential relationship where an increase in leaf temperature results in increased respiration. I then plotted an exponential curve through the data:

ADC_dark %>%
    ggplot(aes(x=Tleaf, y=abs_A))+
    geom_point()+
    stat_smooth(method='lm', formula = log(y)~x)+
    labs(title="Respiration and leaf temperature", x="Tleaf", y="abs_A")

[Figure: Respiration-Leaf temperature in R]

This is not looking very good. The formula matching this line is y = -2.70206 * e^(0.11743*x)

Call:
lm(formula = log(ADC_dark$abs_A) ~ ADC_dark$Tleaf)

Residuals:
    Min      1Q  Median      3Q     Max 
-2.0185 -0.1059  0.1148  0.2698  0.6825 

Coefficients:
               Estimate Std. Error t value Pr(>|t|)    
(Intercept)    -2.70206    0.51255  -5.272 5.18e-05 ***
ADC_dark$Tleaf  0.11743    0.02161   5.435 3.66e-05 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.5468 on 18 degrees of freedom
Multiple R-squared:  0.6213,    Adjusted R-squared:  0.6003 
F-statistic: 29.54 on 1 and 18 DF,  p-value: 3.659e-05

When I use the same data in Excel I get this: [Figure: Respiration - leaf temperature in Excel]

As you can see, the intercept differs between these two suggested exponential relationships. Just looking at the pictures, I would say that Excel is doing a better job.

How can I 'train' R to make a better fitted curve through my data, or am I misinterpreting something?


Solution

  • The problem is that when you fit inside ggplot2 with stat_smooth using log(y) ~ x, the data points and the fitted line end up on different scales. You are effectively plotting y and log(y) on the same y axis, and since y > log(y) for any positive y, the fitted line sits below your data points.

    You have several options, such as tweaking the axes and scales, or simply using glm (a generalized linear model) with a log link instead of lm. In that case the scales are preserved and no additional tweaking is needed. Note that a Gaussian glm with a log link assumes additive errors on the original scale rather than on the log scale, so its coefficients can differ somewhat from those of the lm fit.

    library(ggplot2)
    library(magrittr)  # provides %>%; ggplot2 does not export it

    # Simulated stand-in for the real data
    set.seed(123)
    Tleaf <- 1:20
    ADC_dark <- data.frame(Tleaf = Tleaf,
                           abs_A = exp(0.11 * Tleaf - 2.7 + rnorm(20) / 10))

    # Gaussian GLM with a log link: fitted and drawn on the original y scale
    ADC_dark %>%
      ggplot(aes(x = Tleaf, y = abs_A)) +
      geom_point() +
      geom_smooth(method = "glm", formula = y ~ x,
                  method.args = list(family = gaussian(link = "log"))) +
      labs(title = "Respiration and leaf temperature", x = "Tleaf", y = "abs_A")
    

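On the differing intercepts: lm(log(y) ~ x) reports its intercept on the log scale, so the multiplier of the exponential is exp(intercept), not the intercept itself. Below is a minimal base R sketch of that back-transform. The simulated data are only a stand-in for ADC_dark, and the names fit_log, log_a, a, b and fitted_y are assumptions made for illustration.

```r
# Simulated stand-in for the question's data (values are illustrative only)
set.seed(123)
Tleaf <- 1:20
ADC_dark <- data.frame(Tleaf = Tleaf,
                       abs_A = exp(0.11 * Tleaf - 2.7 + rnorm(20) / 10))

# Linear model on the log scale, as in the question
fit_log <- lm(log(abs_A) ~ Tleaf, data = ADC_dark)

# Coefficients live on the log scale: log(y) = log_a + b * x
log_a <- coef(fit_log)[["(Intercept)"]]
b     <- coef(fit_log)[["Tleaf"]]

# Back-transform the intercept to get y = a * exp(b * x)
a <- exp(log_a)

# Predictions on the original scale: exponentiate the log-scale predictions
ADC_dark$fitted_y <- exp(predict(fit_log, newdata = ADC_dark))

# The same back-transform applied to the intercept reported in the question
exp(-2.70206)
```

Applied to the coefficients in the question, this gives roughly y = 0.067 * e^(0.11743 * x), which is likely the form to compare against the equation Excel displays. To draw the back-transformed curve in ggplot2, the fitted_y column could be added with geom_line(aes(y = fitted_y)) on top of geom_point().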