
ggplot2 facet_grid, no legend, formatting


I'm running a study and want to display the results using facet_grid from ggplot2.

My test data can be found here.

The data has six columns: Ausfallrate (missing rate), PFC_MCA, PFC_Hot, PFC_Mode, AnzahlI (number of I) and AnzahlJ (number of J). I need to plot PFC_MCA, PFC_Hot and PFC_Mode as y-values against Ausfallrate on the x-axis, as scatterplots with connected lines. AnzahlI and AnzahlJ each have 3 levels, which gives 9 possible combinations, and these are used as facets: facet_grid should draw one panel per combination.

This is basically the outcome I'm looking for. (The values in my test data are identical for all 9 combinations, so the panels are identical.)

The problem is that I can't get ggplot2 to draw a legend. The plot was created with this code:

p<-2.5    #PointSize
lineweight<-0.8  #Lineweight
ggplot(test_outputdata,
        aes(x=Ausfallrate))+
        
        geom_point(aes(y=PFC_MCA),color="red",pch=15,cex=p)+
        geom_line(aes(y=PFC_MCA), color="red",linetype="solid",lwd=lineweight) + 
        
        geom_point(aes(y=PFC_Hot),color="blue", pch=16,cex=p)+
        geom_line(aes(y=PFC_Hot), color="blue",linetype="dashed",lwd=lineweight) +
        
        geom_point(aes(y=PFC_Mode),color="black",pch=17,cex=p)+
        geom_line(aes(y=PFC_Mode), color="black",linetype="dotdash",lwd=lineweight) +
        
        facet_grid(AnzahlJ~AnzahlI,
                     # Rename the facet labels
                     labeller = labeller(
                       AnzahlJ = c(`3` = "J=3", `6` = "J=6", `10` = "J=10"),
                       AnzahlI = c(`100` = "I=100", `500` = "I=500", `1000` = "I=1000")
                    )
                  )+
        labs(y= "PFC")+
        ggtitle("MCAR")

I figured out that it has something to do with the aes() function. Let's ignore the shapes, labels and line weights for a second and focus on color. If I put the color inside the aes() call like this:

ggplot(test_outputdata,
        aes(x=Ausfallrate))+
        
        geom_point(aes(y=PFC_MCA,color="red"))+
        geom_line(aes(y=PFC_MCA,color="red")) + 
        
        geom_point(aes(y=PFC_Hot,color="blue"))+
        geom_line(aes(y=PFC_Hot),color="blue") +
        
        geom_point(aes(y=PFC_Mode,color="black"))+
        geom_line(aes(y=PFC_Mode,color="black")) +
        
        facet_grid(AnzahlJ~AnzahlI)

Nice, now I get a legend. But for whatever reason the colors are wrong: the supposedly black line is red, and so on. I also can't figure out how to adjust the point shapes or the line weights this way. I found a suggestion that I should be able to use e.g. shape=18 inside geom_point(), outside aes(), to change the points. This sort of works: it changes the symbol in the plot as expected, but in the legend it changes all the points.

All I want is a working legend and the ability to specify the colors, point symbols, symbol sizes, line widths and linetypes. I've also tried to find something here, but nothing really made sense to me.


Solution

  • There are quite a few 'requested changes' in your question, and I've done my best to address them, but if you have further tweaks you'd like to make (and you can't figure it out yourself) please feel free to leave a comment below and I'll take a look.

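    The approach I've used is based on functions from the tidyverse: reshape the three PFC columns into long format so that a single Type variable can be mapped to color and shape inside aes(), which is what makes ggplot2 draw a legend.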

    Load the tidyverse package and example data:

    library(tidyverse)
    
    df <- read.table(text = "Ausfallrate    PFC_MCA PFC_Hot PFC_Mode    AnzahlI AnzahlJ
    0,1 0,2 0,1 0,2 100 3
    0,2 0,25    0,15    0,3 100 3
    0,3 0,3 0,2 0,4 100 3
    0,4 0,35    0,25    0,5 100 3
    0,5 0,4 0,3 0,6 100 3
    0,1 0,2 0,1 0,2 100 6
    0,2 0,25    0,15    0,3 100 6
    0,3 0,3 0,2 0,4 100 6
    0,4 0,35    0,25    0,5 100 6
    0,5 0,4 0,3 0,6 100 6
    0,1 0,2 0,1 0,2 100 10
    0,2 0,25    0,15    0,3 100 10
    0,3 0,3 0,2 0,4 100 10
    0,4 0,35    0,25    0,5 100 10
    0,5 0,4 0,3 0,6 100 10
    0,1 0,2 0,1 0,2 500 3
    0,2 0,25    0,15    0,3 500 3
    0,3 0,3 0,2 0,4 500 3
    0,4 0,35    0,25    0,5 500 3
    0,5 0,4 0,3 0,6 500 3
    0,1 0,2 0,1 0,2 500 6
    0,2 0,25    0,15    0,3 500 6
    0,3 0,3 0,2 0,4 500 6
    0,4 0,35    0,25    0,5 500 6
    0,5 0,4 0,3 0,6 500 6
    0,1 0,2 0,1 0,2 500 10
    0,2 0,25    0,15    0,3 500 10
    0,3 0,3 0,2 0,4 500 10
    0,4 0,35    0,25    0,5 500 10
    0,5 0,4 0,3 0,6 500 10
    0,1 0,2 0,1 0,2 1000    3
    0,2 0,25    0,15    0,3 1000    3
    0,3 0,3 0,2 0,4 1000    3
    0,4 0,35    0,25    0,5 1000    3
    0,5 0,4 0,3 0,6 1000    3
    0,1 0,2 0,1 0,2 1000    6
    0,2 0,25    0,15    0,3 1000    6
    0,3 0,3 0,2 0,4 1000    6
    0,4 0,35    0,25    0,5 1000    6
    0,5 0,4 0,3 0,6 1000    6
    0,1 0,2 0,1 0,2 1000    10
    0,2 0,25    0,15    0,3 1000    10
    0,3 0,3 0,2 0,4 1000    10
    0,4 0,35    0,25    0,5 1000    10
    0,5 0,4 0,3 0,6 1000    10", header = TRUE)
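A side note on the import: because the sample data uses decimal commas, read.table() with its default settings cannot parse those columns as numbers, which is why the plotting pipeline converts them with parse_number(). A possible alternative, sketched here on a hypothetical three-row excerpt (the name txt is only for illustration), is to pass dec = "," so the columns arrive as numeric straight away. In that case the conversion step in the pipeline should simply find no character columns to change.

```r
# Hypothetical three-row excerpt of the sample data (decimal commas kept).
txt <- "Ausfallrate PFC_MCA PFC_Hot PFC_Mode AnzahlI AnzahlJ
0,1 0,2 0,1 0,2 100 3
0,2 0,25 0,15 0,3 100 3
0,3 0,3 0,2 0,4 100 3"

# Default: "." is the decimal mark, so "0,25" is not recognised as a number
# and the column is not imported as numeric.
raw <- read.table(text = txt, header = TRUE)

# dec = "," tells read.table() to treat the comma as the decimal mark.
num <- read.table(text = txt, header = TRUE, dec = ",")

str(raw$PFC_MCA)
str(num$PFC_MCA)
```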
    

    Create the plot:

    df %>%
      mutate(across(where(is.character),
                    ~parse_number(.x, locale = locale(decimal_mark = ",")))) %>%
      pivot_longer(-c(Ausfallrate, AnzahlJ, AnzahlI),
                   names_to = "Type",
                   values_to = "PFC") %>%
      mutate(Type = factor(Type, levels = c("PFC_Hot", "PFC_Mode", "PFC_MCA"))) %>%
      ggplot(aes(x = Ausfallrate, y = PFC,
                 group = Type, color = Type,
                 shape = Type)) +
      geom_point() +
      geom_line() +
      facet_grid(rows = vars(AnzahlJ), cols = vars(AnzahlI),
                 labeller = labeller(
                   AnzahlJ = c(`3` = "J=3", `6` = "J=6", `10` = "J=10"),
                   AnzahlI= c(`100` = "I=100", `500` = "I=500", `1000` = "I=1000")
      )) +
      scale_color_manual(values = c("blue", "black", "red"),
                         labels = c("PFC_Hot", "PFC_Mode", "PFC_MCA")) +
      scale_shape_manual(values = c(1, 15, 18),
                         labels = c("PFC_Hot", "PFC_Mode", "PFC_MCA"))
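
The question also asks for linetypes and sizes, which this plot leaves at their defaults. As a rough sketch of how the same pattern could be extended (this is an assumption on my part, not part of the reprex output): map linetype to Type as well, give every manual scale named values so each style is tied to a level by name, and set sizes as constants outside aes(). The data frame long below is a small hypothetical stand-in for the pivoted data, and the styles are the ones from the question's first plot.

```r
library(ggplot2)

# Hypothetical long-format data standing in for the pivoted data frame.
long <- data.frame(
  Ausfallrate = rep(c(0.1, 0.3, 0.5), times = 3),
  PFC  = c(0.1, 0.2, 0.3,  0.2, 0.4, 0.6,  0.2, 0.3, 0.4),
  Type = factor(rep(c("PFC_Hot", "PFC_Mode", "PFC_MCA"), each = 3),
                levels = c("PFC_Hot", "PFC_Mode", "PFC_MCA"))
)

p <- ggplot(long, aes(x = Ausfallrate, y = PFC, group = Type,
                      color = Type, shape = Type, linetype = Type)) +
  # Constants outside aes() apply to the whole layer and add no legend entry.
  geom_point(size = 2.5) +
  geom_line(linewidth = 0.8) +  # on ggplot2 older than 3.4.0 use size = 0.8
  # Named values tie each style to a level, independent of the level order.
  scale_color_manual(values = c(PFC_Hot = "blue", PFC_Mode = "black",
                                PFC_MCA = "red")) +
  scale_shape_manual(values = c(PFC_Hot = 16, PFC_Mode = 17, PFC_MCA = 15)) +
  scale_linetype_manual(values = c(PFC_Hot = "dashed", PFC_Mode = "dotdash",
                                   PFC_MCA = "solid"))
p
```

Because all three scales share the same title and labels, ggplot2 should combine them into a single legend. The facet_grid() call from the plot above can be added unchanged.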
    

    Created on 2023-05-19 with reprex v2.0.2
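
As for why the second attempt in the question produced a legend with the wrong colors: a string placed inside aes() is treated as a data value, not as a color. ggplot2 sees the categories "black", "blue" and "red", sorts them, and assigns colors from its default palette, whose first color is a reddish tone. That would explain why the line labelled "black" came out red. The sketch below illustrates this on a small hypothetical data frame d and shows two ways around it; the object names are illustrative only.

```r
library(ggplot2)

d <- data.frame(x = 1:3, a = c(1, 2, 3), b = c(2, 3, 4))

# Inside aes(), "red" and "black" are just category labels; the default
# discrete palette picks the actual colours, so neither line is drawn
# in the colour its label names.
p_default <- ggplot(d, aes(x = x)) +
  geom_line(aes(y = a, color = "red")) +
  geom_line(aes(y = b, color = "black"))

# Option 1: use descriptive labels and map them to colours explicitly.
p_manual <- ggplot(d, aes(x = x)) +
  geom_line(aes(y = a, color = "PFC_MCA")) +
  geom_line(aes(y = b, color = "PFC_Mode")) +
  scale_color_manual(name = "Type",
                     values = c(PFC_MCA = "red", PFC_Mode = "black"))

# Option 2: tell ggplot2 that the labels already are colour names.
p_identity <- p_default + scale_color_identity(guide = "legend")
```

The long-format approach above is usually less repetitive than either option, but option 1 can be handy when reshaping the data is not practical.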