
ggplot Axis scaling


The following script plots data on one common x-axis and two differently scaled y-axes:

scale <- max(data$one) / max(data$two)
#See https://stackoverflow.com/questions/3099219/plot-with-2-y-axes-one-y-axis-on-the-left-and-another-y-axis-on-the-right

ggplot(data, aes(x=time)) +
  labs(title = "Title", x="Time") +
  geom_line(aes(x=time, y=one, col="one")) +
  geom_line(aes(x=time, y=two * scale, col="two")) + 
  scale_x_date(breaks=scales::pretty_breaks(n = 6), expand = c(0,0)) + 
  scale_color_manual(values = c("red", "blue")) +
  scale_y_continuous(name="Left Axis",
                     sec.axis=sec_axis(~./scale, 
                                       breaks = scales::pretty_breaks(n = 6),
                                       name="Right Axis"),
                     minor_breaks=NULL,
                     breaks = scales::pretty_breaks(n = 6)) +
  theme(legend.position = c(.90, .95),
        legend.title=element_blank()) 

Output:

[Image: resulting plot]

The problem with this graph is that, for some reason, the left axis (red line) doesn't scale appropriately: the minimum axis value is far below the true minimum of the data. Among 20 graphs, this is the only one that shows this error. The right axis (blue line) seems fine.

Question: What's the best way to plot these two time series appropriately with two distinct y-axes, if possible without setting limits=?

The data is pasted below.

Thanks for your help!

dput(data):

structure(list(time = structure(c(15431, 15461, 15492, 15522, 
15553, 15584, 15614, 15645, 15675, 15706, 15737, 15765, 15796, 
15826, 15857, 15887, 15918, 15949, 15979, 16010, 16040, 16071, 
16102, 16130, 16161, 16191, 16222, 16252, 16283, 16314, 16344, 
16375, 16405, 16436, 16467, 16495, 16526, 16556, 16587, 16617, 
16648, 16679, 16709, 16740, 16770, 16801, 16832, 16861), class = "Date"), 
    one = c(18.4221796200761, 18.3967231898903, 16.335633117503, 
    16.730296027773, 18.1514409360143, 17.7199441162588, 16.799170250284, 
    15.4179238554614, 17.4392839966129, 17.595792430154, 16.9553497988467, 
    16.4953670246957, 17.5849417055811, 17.9678266256, 17.1739918955819, 
    17.6002431353711, 17.9595179193721, 17.999935039935, 17.7524652108263, 
    17.7177489902007, 17.3588650113878, 17.6725182139017, 17.4405657957642, 
    17.6704974950091, 17.8875447511326, 16.9658703405016, 17.4780706514254, 
    18.1277851044477, 18.1761216072241, 17.1620759199987, 18.4686443856938, 
    18.5762732410043, 17.4880377648686, 17.829055609543, 18.3096877847122, 
    16.9076887297865, 17.8248723024602, 17.9126225373526, 17.6292589941241, 
    18.5179618982914, 18.5366398684046, 18.0677764393526, 16.6418177564593, 
    16.9211415340111, 17.9654709207359, 17.6361844282822, 18.2270338206168, 
    16.4429651207772), two = c(16.6582662387643, 16.6291067627203, 
    16.6371145606996, 16.4961540261816, 17.0596826781454, 16.7062765725755, 
    16.7162150801773, 16.5497433654548, 16.289748798776, 16.2175637229181, 
    16.2087465723472, 16.7917693011438, 17.0844034991311, 17.1252018488917, 
    17.035226133743, 16.7162471713475, 16.7691021105687, 16.2972221957997, 
    16.708788985733, 16.7827078467867, 16.4754347061264, 16.0403929689546, 
    16.2594260331233, 16.6004611360169, 16.6892567341726, 16.7962173532738, 
    16.4205670536356, 16.9984920528234, 16.3600444060327, 16.2367171945242, 
    16.5735658839198, 16.5205739998051, 16.483858284178, 15.7240973247951, 
    17.0266878151606, 16.9142806023417, 16.788529867979, 16.5935730790947, 
    16.424863694705, 17.0269967594955, 16.3278055932357, 16.3832866771369, 
    16.6118799527132, 16.8515435281064, 16.8155186621117, 15.5626066989097, 
    16.2053235276769, 16.9638862520006)), .Names = c("time", 
"one", "two"), row.names = c(NA, 48L), class = "data.frame")

Solution

  • If I understand correctly, the OP wants to scale the second time series two so that the scaled values lie within the same interval as the values of time series one.

    To achieve this, we need to scale the range and also apply an offset, i.e.,

    one = scl * two + ofs

    The values of scl and ofs can be determined from the min() and max() values of one and two (dat is the data frame from the question):

    scl <- (max(dat$one) - min(dat$one)) / (max(dat$two) - min(dat$two))
    ofs <- min(dat$one) - scl * min(dat$two)
    
    library(ggplot2)
    ggplot(dat, aes(x = time)) +
      labs(title = "Title", x = "Time") +
      geom_line(aes(x = time, y = one, col = "one")) +
      geom_line(aes(x = time, y = two * scl + ofs, col = "two")) +
      scale_x_date(breaks = scales::pretty_breaks(n = 6), expand = c(0, 0)) +
      scale_color_manual(values = c("red", "blue")) +
      scale_y_continuous(
        name = "Left Axis",
        sec.axis = sec_axis(~ (. - ofs) / scl, 
                            name = "Right Axis"),
        minor_breaks = NULL,
        breaks = scales::pretty_breaks(n = 6)) +
      theme(legend.position = c(.90, .95),
            legend.title = element_blank())
    

    [Image: resulting plot]

    Now, the minimum and maximum values of both time series are plotted at the same y-positions.

    Please note that the scale on the left side belongs to time series one (red line) and the scale on the right side belongs to time series two (blue line).
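
The mapping can be checked without drawing anything. Requiring that min(two) lands on min(one) and max(two) on max(one) gives the scl and ofs formulas used above, and the sec_axis() formula is simply the inverse of that line. Below is a minimal base R sketch of the check. The helper name range_map and the short vectors are made up for illustration; they are not part of ggplot2 or of the data in the question.

```r
# Hypothetical helper (not a ggplot2 function): find scl and ofs such that
# to = scl * from + ofs maps range(from) onto range(to).
range_map <- function(from, to) {
  scl <- diff(range(to)) / diff(range(from))  # ratio of the two spans
  ofs <- min(to) - scl * min(from)            # shift so the minima coincide
  list(scl = scl, ofs = ofs)
}

# Made-up stand-ins for the columns `one` and `two`.
one <- c(18.4, 16.3, 15.4, 18.6, 17.5)
two <- c(16.7, 16.6, 17.1, 15.6, 16.2)

m <- range_map(from = two, to = one)

# Values drawn against the left axis, as in the second geom_line() call.
two_scaled <- two * m$scl + m$ofs

# Inverse transform, as in sec_axis(~ (. - ofs) / scl): recovers the labels
# shown on the right axis.
two_back <- (two_scaled - m$ofs) / m$scl

# The scaled series spans exactly the range of `one`,
# and the inverse transform returns the original values.
stopifnot(isTRUE(all.equal(range(two_scaled), range(one))))
stopifnot(isTRUE(all.equal(two_back, two)))
```

If both stopifnot() calls pass silently, the forward transform in geom_line() and the inverse in sec_axis() are consistent, which is the property that the scale/scl mix-up would break.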