I am starting to use R and have a bit of a problem. I have a dataset called ADC_dark containing 20 points where leaf temperature and respiration are measured.
I expect an exponential relationship where an increase in leaf temperature results in increased respiration. I then plotted an exponential curve through the data:
ADC_dark %>%
ggplot(aes(x=Tleaf, y=abs_A))+
geom_point()+
stat_smooth(method='lm', formula = log(y)~x)+
labs(title="Respiration and leaf temperature", x="Tleaf", y="abs_A")
This is not looking very good. The formula matching this line is y = -2.70206 * e^(0.11743*x)
Call:
lm(formula = log(ADC_dark$abs_A) ~ ADC_dark$Tleaf)
Residuals:
Min 1Q Median 3Q Max
-2.0185 -0.1059 0.1148 0.2698 0.6825
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -2.70206 0.51255 -5.272 5.18e-05 ***
ADC_dark$Tleaf 0.11743 0.02161 5.435 3.66e-05 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.5468 on 18 degrees of freedom
Multiple R-squared: 0.6213, Adjusted R-squared: 0.6003
F-statistic: 29.54 on 1 and 18 DF, p-value: 3.659e-05
When I use the same data in Excel I get this:
As you can see, the intercepts of the two suggested exponential relationships differ. Just looking at the pictures, I would say that Excel is doing a better job.
How can I 'train' R to make a better fitted curve through my data, or am I misinterpreting something?
The problem is that when you fit inside ggplot2 with stat_smooth using the formula log(y) ~ x, the data points and the fitted line end up on different scales. The model predicts log(y), but that prediction is drawn on the same y axis as the raw y values. Since y > log(y) for any positive y, the fitted line sits below your data points.
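The same point applies to reading the coefficients in the question: the intercept reported by lm is on the log scale, so it has to be exponentiated before it can be used as the multiplier of the exponential. A short derivation, using only the estimates from the summary output in the question (the symbols a and b are introduced here for illustration):

```latex
\log(y) = a + b\,x
\quad\Longrightarrow\quad
y = e^{a}\, e^{b x}

% With a = -2.70206 and b = 0.11743 from the lm summary:
y = e^{-2.70206}\, e^{0.11743\,x} \approx 0.0671\, e^{0.11743\,x}
```

So the multiplier is roughly 0.0671 rather than -2.70206, which would be the natural comparison with the intercept of an exponential trendline from Excel.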
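You have several options, such as tweaking the axes and scales, or using glm (a generalized linear model) with a log link instead of lm. In that case the scale is preserved and no additional tweaking is needed. Note that the glm models the mean of y directly rather than the mean of log(y), so its coefficients can differ slightly from those of the lm fit.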
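library(ggplot2)
library(magrittr)  # provides the %>% pipe

set.seed(123)

# Simulated stand-in for the real data: exponential trend with multiplicative noise
Tleaf <- 1:20
ADC_dark <- data.frame(Tleaf = Tleaf,
                       abs_A = exp(0.11 * Tleaf - 2.7 + rnorm(20) / 10))

ADC_dark %>%
  ggplot(aes(x = Tleaf, y = abs_A)) +
  geom_point() +
  # Gaussian GLM with a log link: the fitted curve is returned on the scale of y
  geom_smooth(method = "glm", formula = y ~ x,
              method.args = list(family = gaussian(link = "log"))) +
  labs(title = "Respiration and leaf temperature", x = "Tleaf", y = "abs_A")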
Output:
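The other route mentioned above, keeping lm on the log scale and fixing the scale mismatch by hand, could look roughly like the sketch below. It assumes the same simulated stand-in data as the example above (the real ADC_dark is not shown), and the names fit and grid are introduced here for illustration only. The idea is to predict log(abs_A) on a grid of temperatures and back-transform with exp() before drawing the line.

```r
library(ggplot2)

set.seed(123)

# Simulated stand-in for ADC_dark (assumption: same toy data as in the answer)
Tleaf <- 1:20
ADC_dark <- data.frame(Tleaf = Tleaf,
                       abs_A = exp(0.11 * Tleaf - 2.7 + rnorm(20) / 10))

# Fit the linear model on the log scale, as in the question
fit <- lm(log(abs_A) ~ Tleaf, data = ADC_dark)

# Predict log(abs_A) on a fine grid, then back-transform to the scale of abs_A
grid <- data.frame(Tleaf = seq(min(ADC_dark$Tleaf), max(ADC_dark$Tleaf),
                               length.out = 100))
grid$abs_A <- exp(predict(fit, newdata = grid))

# Points and back-transformed curve now share the same y scale
p <- ggplot(ADC_dark, aes(x = Tleaf, y = abs_A)) +
  geom_point() +
  geom_line(data = grid, colour = "blue") +
  labs(title = "Respiration and leaf temperature", x = "Tleaf", y = "abs_A")
print(p)
```

Unlike geom_smooth, this draws no confidence band; one could be added by back-transforming the interval returned by predict(fit, newdata = grid, interval = "confidence") in the same way.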