I'm running a study and want to display the results using facet_grid from ggplot2.
My test data can be found here.
The data has six columns: Ausfallrate (missing rate), PFC_MCA, PFC_Hot, PFC_Mode, AnzahlI (number of I) and AnzahlJ (number of J).
I need to plot the variables PFC_MCA, PFC_Hot and PFC_Mode as y-values over the missing rate on the x-axis, as a scatterplot with connected lines.
AnzahlI and AnzahlJ each have 3 levels, which gives 9 possible combinations, and they are used as facets. I have to draw the scatterplot for each of the 9 combinations using facet_grid.
This is basically the outcome I'm looking for. (The data in my test set is identical for all 9 combinations, so the plots are identical.)
The problem is that I can't get ggplot2 to print a legend.
The plot was created with this code:
p <- 2.5           # point size
lineweight <- 0.8  # line weight
ggplot(test_outputdata,
       aes(x = Ausfallrate)) +
  geom_point(aes(y = PFC_MCA), color = "red", pch = 15, cex = p) +
  geom_line(aes(y = PFC_MCA), color = "red", linetype = "solid", lwd = lineweight) +
  geom_point(aes(y = PFC_Hot), color = "blue", pch = 16, cex = p) +
  geom_line(aes(y = PFC_Hot), color = "blue", linetype = "dashed", lwd = lineweight) +
  geom_point(aes(y = PFC_Mode), color = "black", pch = 17, cex = p) +
  geom_line(aes(y = PFC_Mode), color = "black", linetype = "dotdash", lwd = lineweight) +
  facet_grid(AnzahlJ ~ AnzahlI,
             # rename the facet labels
             labeller = labeller(
               AnzahlJ = c(`3` = "J=3", `6` = "J=6", `10` = "J=10"),
               AnzahlI = c(`100` = "I=100", `500` = "I=500", `1000` = "I=1000")
             )
  ) +
  labs(y = "PFC") +
  ggtitle("MCAR")
I figured out that it has something to do with the aes() function. Let's ignore the shapes, labels and line weights for a second and focus on color. If I put the color statement inside the aes() call like this:
ggplot(test_outputdata,
       aes(x = Ausfallrate)) +
  geom_point(aes(y = PFC_MCA, color = "red")) +
  geom_line(aes(y = PFC_MCA, color = "red")) +
  geom_point(aes(y = PFC_Hot, color = "blue")) +
  geom_line(aes(y = PFC_Hot), color = "blue") +
  geom_point(aes(y = PFC_Mode, color = "black")) +
  geom_line(aes(y = PFC_Mode, color = "black")) +
  facet_grid(AnzahlJ ~ AnzahlI)
The result now looks like this
Nice, I get a legend. But for whatever reason the colors are wrong: the supposedly black line is red, etc.
I also can't figure out how to adjust the point shapes or the line weights this way.
Here I found that I should be able to use e.g. shape=18 inside geom_point() to change the points. This sort of works (when done outside aes()). The thing is, it changes the symbol in the plot as expected, but in the legend it changes all points.
All I want to achieve is a working legend and being able to specify the colors, point symbols, size of symbols and lines, and linetypes. I've also tried to find something here, but nothing really made sense to me.
There are quite a few 'requested changes' in your question, and I've done my best to address them, but if you have further tweaks you'd like to make (and you can't figure it out yourself) please feel free to leave a comment below and I'll take a look.
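On the colors coming out wrong: as far as I can tell, a string placed inside aes() is treated as a data value (a category label), not as a literal color. ggplot2 then sorts the labels ("black", "blue", "red") and hands out its default palette in that order, so the "black" series ends up with the first default color, which is a red tone. A minimal sketch of one way around this while keeping the wide data: use the series name as the label and tie each label to a color with a named vector in scale_color_manual(). The data frame `wide` and its values are made up for illustration and are not your test data.

```r
library(ggplot2)

# Made-up data in the same wide layout as the question (illustrative values only)
wide <- data.frame(
  Ausfallrate = c(0.1, 0.2, 0.3),
  PFC_MCA     = c(0.20, 0.25, 0.30),
  PFC_Hot     = c(0.10, 0.15, 0.20),
  PFC_Mode    = c(0.20, 0.30, 0.40)
)

# Inside aes(), the string is only a legend label, not a color.
# The named vector in scale_color_manual() maps each label to its color.
p_wide <- ggplot(wide, aes(x = Ausfallrate)) +
  geom_point(aes(y = PFC_MCA,  color = "PFC_MCA")) +
  geom_line(aes(y = PFC_MCA,   color = "PFC_MCA")) +
  geom_point(aes(y = PFC_Hot,  color = "PFC_Hot")) +
  geom_line(aes(y = PFC_Hot,   color = "PFC_Hot")) +
  geom_point(aes(y = PFC_Mode, color = "PFC_Mode")) +
  geom_line(aes(y = PFC_Mode,  color = "PFC_Mode")) +
  scale_color_manual(name = "Type",
                     values = c(PFC_MCA = "red", PFC_Hot = "blue", PFC_Mode = "black")) +
  labs(y = "PFC")

p_wide
```

This gives a legend with the intended colors, but shapes and linetypes would each need the same treatment per layer, which is why I'd rather reshape the data to long format.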
The approach I've used is based on functions from the tidyverse library:
Load the tidyverse package and example data:
library(tidyverse)
df <- read.table(text = "Ausfallrate PFC_MCA PFC_Hot PFC_Mode AnzahlI AnzahlJ
0,1 0,2 0,1 0,2 100 3
0,2 0,25 0,15 0,3 100 3
0,3 0,3 0,2 0,4 100 3
0,4 0,35 0,25 0,5 100 3
0,5 0,4 0,3 0,6 100 3
0,1 0,2 0,1 0,2 100 6
0,2 0,25 0,15 0,3 100 6
0,3 0,3 0,2 0,4 100 6
0,4 0,35 0,25 0,5 100 6
0,5 0,4 0,3 0,6 100 6
0,1 0,2 0,1 0,2 100 10
0,2 0,25 0,15 0,3 100 10
0,3 0,3 0,2 0,4 100 10
0,4 0,35 0,25 0,5 100 10
0,5 0,4 0,3 0,6 100 10
0,1 0,2 0,1 0,2 500 3
0,2 0,25 0,15 0,3 500 3
0,3 0,3 0,2 0,4 500 3
0,4 0,35 0,25 0,5 500 3
0,5 0,4 0,3 0,6 500 3
0,1 0,2 0,1 0,2 500 6
0,2 0,25 0,15 0,3 500 6
0,3 0,3 0,2 0,4 500 6
0,4 0,35 0,25 0,5 500 6
0,5 0,4 0,3 0,6 500 6
0,1 0,2 0,1 0,2 500 10
0,2 0,25 0,15 0,3 500 10
0,3 0,3 0,2 0,4 500 10
0,4 0,35 0,25 0,5 500 10
0,5 0,4 0,3 0,6 500 10
0,1 0,2 0,1 0,2 1000 3
0,2 0,25 0,15 0,3 1000 3
0,3 0,3 0,2 0,4 1000 3
0,4 0,35 0,25 0,5 1000 3
0,5 0,4 0,3 0,6 1000 3
0,1 0,2 0,1 0,2 1000 6
0,2 0,25 0,15 0,3 1000 6
0,3 0,3 0,2 0,4 1000 6
0,4 0,35 0,25 0,5 1000 6
0,5 0,4 0,3 0,6 1000 6
0,1 0,2 0,1 0,2 1000 10
0,2 0,25 0,15 0,3 1000 10
0,3 0,3 0,2 0,4 1000 10
0,4 0,35 0,25 0,5 1000 10
0,5 0,4 0,3 0,6 1000 10", header = TRUE)
Create the plot:
df %>%
  mutate(across(where(is.character),
                ~parse_number(.x, locale = locale(decimal_mark = ",")))) %>%
  pivot_longer(-c(Ausfallrate, AnzahlJ, AnzahlI),
               names_to = "Type",
               values_to = "PFC") %>%
  mutate(Type = factor(Type, levels = c("PFC_Hot", "PFC_Mode", "PFC_MCA"))) %>%
  ggplot(aes(x = Ausfallrate, y = PFC,
             group = Type, color = Type,
             shape = Type)) +
  geom_point() +
  geom_line() +
  facet_grid(rows = vars(AnzahlJ), cols = vars(AnzahlI),
             labeller = labeller(
               AnzahlJ = c(`3` = "J=3", `6` = "J=6", `10` = "J=10"),
               AnzahlI = c(`100` = "I=100", `500` = "I=500", `1000` = "I=1000")
             )) +
  scale_color_manual(values = c("blue", "black", "red"),
                     labels = c("PFC_Hot", "PFC_Mode", "PFC_MCA")) +
  scale_shape_manual(values = c(1, 15, 18),
                     labels = c("PFC_Hot", "PFC_Mode", "PFC_MCA"))
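The linetypes and the point and line sizes from your question can be handled the same way. The sketch below is one possible extension, not the only way to do it: linetype is mapped to Type and set with scale_linetype_manual(), while the sizes are constants passed outside aes(). The data frame `long` is a small made-up example in long format (illustrative values only), and the `linewidth` argument assumes ggplot2 3.4.0 or later (older versions use `size` for lines).

```r
library(ggplot2)

# Made-up long-format data (illustrative values only): one row per rate and type
long <- data.frame(
  Ausfallrate = rep(c(0.1, 0.2, 0.3), times = 3),
  PFC  = c(0.10, 0.15, 0.20,  0.20, 0.30, 0.40,  0.20, 0.25, 0.30),
  Type = factor(rep(c("PFC_Hot", "PFC_Mode", "PFC_MCA"), each = 3),
                levels = c("PFC_Hot", "PFC_Mode", "PFC_MCA"))
)

p_long <- ggplot(long, aes(x = Ausfallrate, y = PFC, group = Type,
                           color = Type, shape = Type, linetype = Type)) +
  geom_point(size = 2.5) +      # constant point size, set outside aes()
  geom_line(linewidth = 0.8) +  # constant line width (assumes ggplot2 >= 3.4.0)
  # Named vectors tie each Type to its color, symbol and linetype
  scale_color_manual(values = c(PFC_Hot = "blue", PFC_Mode = "black", PFC_MCA = "red")) +
  scale_shape_manual(values = c(PFC_Hot = 16, PFC_Mode = 17, PFC_MCA = 15)) +
  scale_linetype_manual(values = c(PFC_Hot = "dashed", PFC_Mode = "dotdash", PFC_MCA = "solid"))

p_long
```

Because all three scales share the same title and labels, ggplot2 should merge them into a single legend whose keys show the color, symbol and linetype of each series.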
Created on 2023-05-19 with reprex v2.0.2