
Trend line changes depending on axis scale in ggplot2


I'm using ggplot2 and ggpmisc to draw a trend line and label it with the fitted equation.

library(ggplot2)
library(ggpmisc)
library(scales)

x <- c(5,2,6,8,9,1,3,6,8,2)
y <- c(4,7,2,5,7,9,5,2,1,3)
df <- data.frame(x,y)

g <- ggplot(df,aes(x,y))
g <- g + geom_point(colour = "black")
g <- g + stat_smooth(method = lm, formula = y ~ x, se = FALSE)
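# after_stat() replaces the deprecated stat() for referring to computed variables
g <- g + stat_poly_eq(formula = y ~ x,
                      aes(label = paste(after_stat(eq.label),after_stat(rr.label),after_stat(adj.rr.label),after_stat(p.value.label),sep = "~~~")),
                      label.x = "right",label.y = "bottom",parse = TRUE)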

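# Three variants of the same plot. gg is overwritten each time,
# so the gg label quoted below comes from the last one (scale_y_log10).
gg <- g + coord_trans(y = "identity")  # coordinate transformation, axis unchanged
gg <- g + coord_trans(y = "log")       # coordinate transformation, log axis
gg <- g + scale_y_log10(breaks=10^(0:3),  # scale transformation, log10 axis
                          labels=trans_format("log10",math_format(10^.x)))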


  • For g, the label is y = 6.26 - 0.351x
  • For gg, it is y = 0.782 - 0.0419x

When I switch the y axis to a log scale with exponent labels, the value of eq.label changes. I expected eq.label to stay the same, because I am only changing the axis scale. Can you tell me why it changes?


Solution

  • The main thing to understand, in my opinion, is this statement from the documentation linked below:

    • The difference between transforming the scales and transforming the coordinate system is that scale transformation occurs BEFORE statistics, and coordinate transformation afterwards. Coordinate transformation also changes the shape of geoms:
    1. When the transformation is applied to the scale (scale_y_log10), it happens BEFORE the statistics, so the sum of squared errors is minimised on the transformed data. The model that gets fitted is log10(y) ~ x rather than y ~ x, which is why the equation changes. This is fine if the log of the variable is linear in x.

    2. When you transform the coordinates instead (coord_trans), the transformation happens AFTER the statistics, so the sum of squared errors is minimised on the untransformed data and the fitted equation is the same as on the untransformed plot.

    See here: https://ggplot2.tidyverse.org/reference/coord_trans.html
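
The two cases can be checked outside ggplot2. The sketch below refits the question's data with base R's lm(), once on y and once on log10(y). It assumes that stat_poly_eq with formula = y ~ x performs an ordinary least-squares fit on whatever data the stat receives; the names fit_raw, fit_log, coef_raw and coef_log are only illustrative.

```r
# Data from the question
x <- c(5, 2, 6, 8, 9, 1, 3, 6, 8, 2)
y <- c(4, 7, 2, 5, 7, 9, 5, 2, 1, 3)

# Fit on the raw data: what the stat sees with no transformation,
# or with coord_trans(), which is applied after the statistics.
fit_raw <- lm(y ~ x)

# Fit on log10(y): what the stat sees after scale_y_log10(),
# which transforms the data before the statistics.
fit_log <- lm(log10(y) ~ x)

# Intercept and slope, rounded to 3 significant digits
coef_raw <- signif(unname(coef(fit_raw)), 3)  # intercept 6.26, slope -0.351
coef_log <- signif(unname(coef(fit_log)), 3)  # intercept 0.782, slope -0.0419

print(coef_raw)
print(coef_log)
```

The first pair matches the label on g and the second matches the label on gg, so the change in eq.label comes from fitting a different model, not from relabelling the axis. If the aim is a log axis with the original linear fit, using coord_trans(y = "log10") in place of scale_y_log10() should keep the equation unchanged, although the fitted line is then drawn as a curve because coordinate transformation changes the shape of geoms.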