r ggplot2 legend legend-properties x-axis

Multiple Data Frame Plot Custom Legend and X Axis Breaks?


The data covers 1965-2019 and compares renewable energy with alternative energy sources. I have two issues I am trying to tackle here: 1) I would like to add a custom legend identifying the two colors and naming them accordingly. 2) I want to know if there is an easy way to set the breaks on the x-axis, essentially showing only every other year or every 5 years, for better interpretability. My code and the current output plot are below. The plot is an overlay of two data frames, each having a simple x and y variable.

renew_plot <- ggplot(NULL, aes(x, y)) +
  xlab('Years') +
  ylab('Terawatt/Hr') +
  ggtitle('Global Energy Consumption By Year') +
  geom_point(data = renew_tot_energy_con_by_year, col = "green") +
  geom_point(data = oth_tot_energy_con_by_year, col = "blue")
renew_plot

The main issue I am running into is figuring out the legend, since the points come from separate data frames. Is there a way I can create a custom legend with two simple geom points whose colors match the ones in my plot? That would be the easiest, I believe. I am not sure of the best way to go about this.


Solution

  • I'm guessing from your squished x axis that the x data in at least one of the two data sets is in character or factor format, so ggplot2 is not automatically selecting "pretty" breaks but rather showing all of them. You could either convert the x values to numeric (e.g. a$x = as.numeric(a$x) if it's character, or a$x = as.numeric(as.character(a$x)) if it's factor) or specify discrete breaks like below.

    To get the source into the legend, you could map the source name to the color aesthetic.

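    library(ggplot2)

    # Example data: years stored as character, so x is treated as discrete
    a <- data.frame(x = as.character(1990:2021), y = mtcars$mpg)
    b <- data.frame(x = as.character(1990:2021), y = mtcars$hp)

    ggplot(NULL, aes(x, y)) +
      # Mapping a constant string to color creates one legend entry per source
      geom_point(data = a, aes(color = "a")) +
      geom_point(data = b, aes(color = "b")) +
      # Label only every fifth year
      scale_x_discrete(breaks = seq(1990, 2020, by = 5))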
    

    Without the scale_x_discrete part, every year gets its own label on the x axis.

    With the scale_x_discrete part, only every fifth year (1990, 1995, and so on up to 2020) is labeled.
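
As an alternative to discrete breaks, here is a rough sketch of the numeric-conversion route mentioned in the answer, combined with scale_color_manual so the legend uses the green and blue from the question. The data frames a and b are the same mtcars stand-ins as in the answer, not the real energy data, and the legend labels "Renewable" and "Other" and the legend title "Source" are assumptions chosen for illustration.

```r
library(ggplot2)

# Stand-in data shaped like the example in the answer: years stored as
# character strings, which is what makes ggplot2 treat x as discrete.
a <- data.frame(x = as.character(1990:2021), y = mtcars$mpg)
b <- data.frame(x = as.character(1990:2021), y = mtcars$hp)

# Convert the years to numeric so the x axis becomes continuous.
# (For a factor column, use as.numeric(as.character(...)) instead.)
a$x <- as.numeric(a$x)
b$x <- as.numeric(b$x)

p <- ggplot(NULL, aes(x, y)) +
  # The strings inside aes() are hypothetical legend labels; they only
  # need to match the names used in `values` below.
  geom_point(data = a, aes(color = "Renewable")) +
  geom_point(data = b, aes(color = "Other")) +
  # Place a break every 5 years on the now-continuous axis.
  scale_x_continuous(breaks = seq(1990, 2020, by = 5)) +
  # Assign the colors from the question to the legend entries.
  scale_color_manual(
    name   = "Source",
    values = c(Renewable = "green", Other = "blue")
  ) +
  # Axis labels and title taken from the question's code.
  labs(x = "Years", y = "Terawatt/Hr",
       title = "Global Energy Consumption By Year")

p
```

With the real data, a and b would be replaced by renew_tot_energy_con_by_year and oth_tot_energy_con_by_year. Once x is numeric, ggplot2 may already pick reasonable breaks on its own, in which case the scale_x_continuous line is only needed to force a specific spacing.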