r ggplot2 axis

Change x-axis label values in ggplot by using values from a different column


I have the following code, in which I fit a log-linear model and plot the fit:

r1 = lm(log10(df$size) ~ df$time + ...)
plt <- ggplot(data = df, aes(x=time, y=size))+
    geom_point(size = 0.1) +    # Scatter
    geom_line(aes(y = 10^predict(r1))) +
    scale_y_log10()

The data frame df contains three columns: datetime, time and size. The time column is a normalized numeric conversion of datetime, so that r1 can be fitted on float values instead of datetimes. Datetimes are incremented by 10 minutes.

datetime             time  size
2022-07-01 00:00:00     0     4
2022-07-01 00:00:10   600    34
2022-07-01 00:00:20  1200    12

The problem I am facing now is that I want the x-axis to show the datetime values instead of the time values.

We can use scale_x_continuous(breaks = df$time, labels = df$datetime); however, this puts an x label at every point. Instead, I would like to keep the axis breaks that ggplot selects by default (probably via waiver()) and label each of them with the value at the same index in df$datetime.

I tried the approach from "Manual addition of x-axes in ggplot with new labels?"; however, the difference in my case is that I have another column with datetimes that I want to use.


Solution

  • As already mentioned by @Z.Lin, replacing the break labels with values from another column is easy in general, but it only works if your data contains a corresponding value for each of the default breaks set by ggplot2. In general (see your example data) that is not the case.

    One option would be to set the labels by interpolation, e.g. with Hmisc::approxExtrap:

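    library(ggplot2) # the labels function below also needs the Hmisc package installed
    
    # Fit the log-linear model; df is the example data defined further down
    r1 <- lm(log10(size) ~ time, data = df)
    p <- ggplot(data = df, aes(x = time, y = size)) +
      geom_point(size = 0.1) + # Scatter
      geom_line(aes(y = 10^predict(r1))) + # Fit, back-transformed from log10
      scale_y_log10()
    
    p +
      scale_x_continuous(
        labels = function(x) {
          # Interpolate (or extrapolate) a datetime for each non-missing break
          x[!is.na(x)] <- Hmisc::approxExtrap(df$time, df$datetime, xout = x[!is.na(x)])$y
          # Convert the resulting seconds since the epoch back to date-times
          as.POSIXct(x, origin = "1970-01-01")
        }
      ) +
      # Extra right margin, presumably so the last long label is not cut off
      theme(plot.margin = margin(r = 50))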
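
If adding Hmisc as a dependency is not desired, the same idea can be sketched in base R. This is only an illustration under an assumption taken from the question: that time is an exact linear conversion of datetime, so two rows are enough to map any break back to a datetime. The helper name time_to_datetime and the tz = "UTC" setting are hypothetical choices, not part of the original code.

```r
# Example data with the same values as in the question (tz = "UTC" is an assumption)
df <- data.frame(
  datetime = as.POSIXct(
    c("2022-07-01 00:00:00", "2022-07-01 00:00:10", "2022-07-01 00:00:20"),
    tz = "UTC"
  ),
  time = c(0, 600, 1200),
  size = c(4, 34, 12)
)

# Hypothetical helper: maps values of `time` back to date-times.
# Assumes `datetime` is a linear function of `time` and that the first
# and last rows have different `time` values.
time_to_datetime <- function(x, time = df$time, datetime = df$datetime) {
  secs <- as.numeric(datetime) # seconds since 1970-01-01
  n <- length(secs)
  slope <- (secs[n] - secs[1]) / (time[n] - time[1]) # seconds per unit of `time`
  # NA breaks stay NA; values outside the data range are extrapolated linearly
  as.POSIXct(secs[1] + (x - time[1]) * slope, origin = "1970-01-01", tz = "UTC")
}

# Breaks between, beyond, and missing from the observed values
time_to_datetime(c(300, 1500, NA))
```

With the plot object p from above, this could presumably be plugged in as scale_x_continuous(labels = time_to_datetime). Wrapping the result in format(..., "%H:%M:%S") would give shorter labels if the full date is not needed.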
    

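    # Example data; run this before the plotting code above
    df <- data.frame(
      datetime = c(
        "2022-07-01 00:00:00",
        "2022-07-01 00:00:10",
        "2022-07-01 00:00:20"
      ),
      time = c(0L, 600L, 1200L),
      size = c(4L, 34L, 12L)
    )
    
    # Parse the strings into date-times (uses the local time zone)
    df$datetime <- as.POSIXct(df$datetime)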
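
The caveat at the start of this answer, that a direct lookup only works when every break has a matching row, can be checked with a small base R sketch. The break positions below are hypothetical and chosen for illustration; the breaks ggplot2 actually picks for a given plot may differ. The tz = "UTC" setting is likewise an assumption.

```r
# Example data with the same values as above (tz = "UTC" is an assumption)
df <- data.frame(
  datetime = as.POSIXct(
    c("2022-07-01 00:00:00", "2022-07-01 00:00:10", "2022-07-01 00:00:20"),
    tz = "UTC"
  ),
  time = c(0, 600, 1200),
  size = c(4, 34, 12)
)

# Hypothetical break positions, not necessarily the ones ggplot2 would choose
example_breaks <- c(0, 300, 600, 900, 1200)

# Direct lookup: take the datetime from the row whose `time` equals each break
looked_up <- df$datetime[match(example_breaks, df$time)]

# Breaks 300 and 900 have no matching row, so their labels come back as NA
is.na(looked_up)
```

Any break that falls between two observations ends up without a label, which is why the answer interpolates instead of indexing into df$datetime.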
    
