I am currently working in R, and I am having some trouble with boxplots from the ggplot2 package.
What I want to do is plot the NO2 concentration against the speed of the vehicles on the road, so both the x-axis and the y-axis are continuous. When I use geom_boxplot, I get the following graph:
ggplot(df, aes(x=Speed, y=Concentration)) +
geom_boxplot() +
scale_x_continuous(limits = c(0, 100)) +
scale_y_continuous(limits = c(0,500))
We can see that the boxes are placed seemingly at random on this graph. What I want is a separate boxplot for every 20 km/h between 0 and 100 km/h.
I have tried different things seen in other topics on the forum, like:
aes(group = cut_width(Speed, 20))
but nothing changes, and the boxes are still not positioned every 20 km/h.
I am not sure my explanation is very clear, so please do not hesitate to ask if something is unclear.
I have been trying to solve this problem for a few days, and I would be very grateful if someone could help me with it.
Thank you, Valentine
Edit: Here is some code to create a dataset, and a picture of the result.
df = data.frame(matrix(ncol = 2, nrow = 20))
colnames(df) = c("Speed", "Concentrations")
df$Speed = runif(20, 0,100)
df$Concentrations = runif(20,0,500)
ggplot(df, aes(x = Speed, y = Concentrations)) + geom_boxplot(aes(group = cut_width(Speed, 20)))
The result is here. What I want is to have a box at Speed 20, 40, 60, 80.
Consider adding the following discrete variable to your data instead of applying cut_width() in your ggplot commands:
df$Speed_Cat = cut_width(df$Speed, 20)
Then your plot will be constructed via:
ggplot(df, aes(x = Speed_Cat, y = Concentrations)) +
geom_boxplot() +
scale_x_discrete(labels=seq(0,100,20))
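Just know what you want your cuts to represent! With a width of 20 and the default centering, the buckets become [-10,10], (10,30], ..., so the labels 0, 20, ..., 100 above are the bucket centers (this assumes all six buckets appear in the data). You can always adjust the buckets when you create the variable in your data.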
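As one way of adjusting the buckets, here is a minimal sketch that uses base R's cut() with explicit breaks, so the bins are [0,20], (20,40], ..., (80,100] regardless of the data. The seed and the axis labels are assumptions made only for this example; drop = FALSE in scale_x_discrete() is there to keep any empty bin on the axis.

```r
library(ggplot2)

set.seed(1)  # assumed seed, only to make the example reproducible

# Same kind of random dataset as in the question
df <- data.frame(
  Speed = runif(20, 0, 100),
  Concentrations = runif(20, 0, 500)
)

# Fixed 20 km/h bins: [0,20], (20,40], (40,60], (60,80], (80,100]
df$Speed_Cat <- cut(df$Speed, breaks = seq(0, 100, 20), include.lowest = TRUE)

p <- ggplot(df, aes(x = Speed_Cat, y = Concentrations)) +
  geom_boxplot() +
  scale_x_discrete(drop = FALSE) +  # keep bins that contain no observations
  labs(x = "Speed (km/h)", y = "Concentrations")

print(p)
```

Because the breaks are fixed, the factor levels no longer depend on the minimum and maximum speed in the data, so the axis labels can be read directly as speed ranges.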