I have been trying to create a graph with a non-linear (and non-log) axis scale. Ideally the axis would not be discontinuous. It is hard to explain, so I will show it with pictures. My current graph uses an ln-transformed scale to give:
The problem is that most of my data are log-normally skewed, and I would like the large majority of the data to be centered on the graph. If I could do this perfectly, the axis would scale like this:
Upper 20% of graph = 1,001-40,000
Mid 30% of graph = 201-1,000
Lower 50% of graph = 1-200
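The layout above amounts to a piecewise mapping from data values to axis position. As a minimal sketch (not a tested solution for this data set), one continuous way to express it is a custom transformation built with scales::trans_new() that interpolates linearly on the log scale between anchor points. The function name piecewise_log_trans and the knot vectors are illustrative assumptions; the anchor values are taken from the list above.

```r
library(scales)

# Anchor points from the list above: data values and the fraction of the
# axis height at which each one should sit (0 = bottom, 1 = top).
knots_value <- c(1, 200, 1000, 40000)
knots_pos   <- c(0, 0.5, 0.8, 1)

# Hypothetical helper: builds a transformation that is linear in log(x)
# between consecutive knots. rule = 2 clamps anything outside the knot
# range to the nearest edge, so values below 1 or above 40000 would be
# drawn on the axis boundary.
piecewise_log_trans <- function(values = knots_value, pos = knots_pos) {
  log_values <- log(values)
  trans_new(
    name      = "piecewise_log",
    transform = function(x) approx(log_values, pos, xout = log(x), rule = 2)$y,
    inverse   = function(y) exp(approx(pos, log_values, xout = y, rule = 2)$y)
  )
}

tr <- piecewise_log_trans()
tr$transform(c(1, 200, 1000, 40000))  # axis positions of the anchor values
```

Passing piecewise_log_trans() to the trans argument of scale_y_continuous() (called transform in recent ggplot2 releases) in place of 'log' should then give a single continuous axis with roughly the 50/30/20 split, with no gaps between segments. The spacing changes abruptly at 200 and 1,000, so explicit breaks near those values would help readers see where the scale shifts.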
To attempt this, I tried the gg.gap package. I thought a discontinuous axis would work, but it introduces white space that, from what I can tell, cannot be minimized. I have also tried stacking three graphs vertically. Using cowplot I have achieved this:
This is much closer to what I want, but the white space still exists, and because of the way the plot margins work, some data points were cut in half, leaving weird half circles at the extremities. Note: I have since fixed the clipping with coord_cartesian(clip = "off"); the points are no longer clipped.
I am at a loss as to how to solve this, so I thought I would reach out for help. Here is a minimal reproducible example (still long, but it shows everything that produces each graph).
#Generate Random Data set:
set.seed(1)
graphdata <-as.data.frame(rlnorm(29000, meanlog = 4.442651, sdlog = 0.85982))
colnames(graphdata)[1] <- "values"
high_values <- as.data.frame(rlnorm(1000, meanlog = 9.9, sdlog = 0.85))
colnames(high_values)[1] <- "values"
graphdata <- rbind(graphdata,high_values)
graphdata$values <- round(graphdata$values, digits = 0)
#Current Plot
#I used trans = 'log' to set the axis to a natural log scale. It worked until my data became large enough that I needed to change the axis scaling for better visibility.
library(tidyverse)
graph <- ggplot(graphdata, aes(y = values, x=1))+
geom_jitter(aes(colour = cut(values, c(0,100,200,Inf))), alpha = 0.2, size = 0.5)+
scale_color_manual(values = c("#F0CF19","#F07C0B", "#D52828"))+
scale_y_continuous(breaks = c(0,20,50,100,200,500,1000,2000,5000,10000,20000,40000), trans='log', limits = c(14, 40001), expand = c(0, 0))+
labs(y = "Values", x = NULL)+
scale_x_continuous(expand = c(0.01, 0))+ coord_cartesian(clip = "off")+
theme_classic()+
theme(legend.position = "none",
axis.text.x = element_blank(),
axis.ticks.x = element_blank(),
axis.line.x = element_blank())
graph
#My attempt to get an altered axis arrangement - using multiple plots and stacking them:
graph1 <- ggplot(graphdata, aes(y = values, x=1))+
geom_jitter(aes(colour = cut(values, c(0,100,200,Inf))), alpha = 0.2, size = 0.5)+
scale_color_manual(values = c("#F0CF19","#F07C0B", "#D52828"))+
scale_y_continuous(breaks = c(0,20,50,100,200), trans='log', limits = c(14, 200), expand = c(0, 0))+
labs(y = NULL, x = NULL)+
scale_x_continuous(expand = c(0.01, 0))+
theme_classic()+ coord_cartesian(clip = "off")+
theme(legend.position = "none",
axis.text.x = element_blank(),
axis.ticks.x = element_blank(),
axis.line.x = element_blank(),
plot.margin = margin(t=0, unit = "pt"))
graph1
graph2 <- ggplot(graphdata, aes(y = values, x=1))+
geom_jitter(aes(colour = cut(values, c(0,100,200,Inf))), alpha = 0.2, size = 0.5)+
scale_color_manual(values = c("#F0CF19","#F07C0B", "#D52828"))+
scale_y_continuous(breaks = c(500,1000), trans='log', limits = c(201, 1000), expand = c(0, 0))+
labs(y = "Values", x = NULL)+
scale_x_continuous(expand = c(0.01, 0))+
theme_classic()+ coord_cartesian(clip = "off")+
theme(legend.position = "none",
axis.text.x = element_blank(),
axis.ticks.x = element_blank(),
axis.line.x = element_blank(),
plot.margin = margin(t=0, unit = "pt"))
graph2
graph3 <- ggplot(graphdata, aes(y = values, x=1))+
geom_jitter(aes(colour = cut(values, c(0,100,200,Inf))), alpha = 0.2, size = 0.5)+
scale_color_manual(values = c("#F0CF19","#F07C0B", "#D52828"))+
scale_y_continuous(breaks = c(10000,20000,40000), trans='log', limits = c(1001, 40001), expand = c(0, 0))+
labs(y = NULL,x = NULL)+
scale_x_continuous(expand = c(0.01, 0))+
theme_classic()+ coord_cartesian(clip = "off")+
theme(legend.position = "none",
axis.text.x = element_blank(),
axis.ticks.x = element_blank(),
axis.line.x = element_blank(),
plot.margin = margin(t=0, unit = "pt"))
graph3
#Using cowplot, I stitched together three panels to make one graph that is close to what I want. But the problem lies with the white space between the panels, which I want to get rid of. Also, without coord_cartesian(clip = "off"), this method leads to some points being cut off, leaving weird half circles.
library(cowplot);library(grid); library(egg)
graph4 <- cowplot::plot_grid(graph3, graph2, graph1, align = "v", ncol = 1, rel_heights = c(0.25,0.25,0.5))
graph4 <- set_panel_size(graph4, width = unit(7, "cm"), height = unit(6, "cm"))
grid.newpage()
grid.draw(graph4)
Thanks!
It looks like I needed to add a layer of NULL graphs to adjust the white space. I set the top and bottom plot margins to 0, then added a NULL graph between each pair of panels and gave it a negative relative height. Here is the adjusted code:
#Load ggplot2 (via tidyverse) so this block runs on its own
library(tidyverse)
#Generate Random Data set:
set.seed(1)
graphdata <-as.data.frame(rlnorm(29000, meanlog = 4.442651, sdlog = 0.85982))
colnames(graphdata)[1] <- "values"
high_values <- as.data.frame(rlnorm(1000, meanlog = 9.9, sdlog = 0.85))
colnames(high_values)[1] <- "values"
graphdata <- rbind(graphdata,high_values)
graphdata$values <- round(graphdata$values, digits = 0)
graph1 <- ggplot(graphdata, aes(y = values, x=1))+
geom_jitter(aes(colour = cut(values, c(0,100,200,Inf))), alpha = 0.2, size = 0.5)+
scale_color_manual(values = c("#F0CF19","#F07C0B", "#D52828"))+
scale_y_continuous(breaks = c(0,20,50,100,200), trans='log', limits = c(14, 200), expand = c(0, 0))+
labs(y = NULL, x = NULL)+
scale_x_continuous(expand = c(0.01, 0))+
theme_classic()+
coord_cartesian(clip = "off")+
theme(legend.position = "none",
axis.text.x = element_blank(),
axis.ticks.x = element_blank(),
axis.line.x = element_blank(),
axis.title.x = element_blank(),
axis.title.y = element_blank(),
plot.margin = margin(0,0,0,0))
graph1
graph2 <- ggplot(graphdata, aes(y = values, x=1))+
geom_jitter(aes(colour = cut(values, c(0,100,200,Inf))), alpha = 0.2, size = 0.5)+
scale_color_manual(values = c("#F0CF19","#F07C0B", "#D52828"))+
scale_y_continuous(breaks = c(500,1000), trans='log', limits = c(201, 1000), expand = c(0, 0))+
labs(y = "Longer title Values", x = NULL)+
scale_x_continuous(expand = c(0.01, 0))+
theme_classic()+
coord_cartesian(clip = "off")+
theme(legend.position = "none",
axis.text.x = element_blank(),
axis.ticks.x = element_blank(),
axis.line.x = element_blank(),
axis.title.x = element_blank(),
plot.margin = margin(0,0,0,0))
graph2
graph3 <- ggplot(graphdata, aes(y = values, x=1))+
geom_jitter(aes(colour = cut(values, c(0,100,200,Inf))), alpha = 0.2, size = 0.5)+
scale_color_manual(values = c("#F0CF19","#F07C0B", "#D52828"))+
scale_y_continuous(breaks = c(10000,20000,40000), trans='log', limits = c(1001, 40001), expand = c(0, 0))+
labs(y = NULL,x = NULL)+
scale_x_continuous(expand = c(0.01, 0))+
theme_classic()+
coord_cartesian(clip = "off")+
theme(legend.position = "none",
axis.text.x = element_blank(),
axis.ticks.x = element_blank(),
axis.line.x = element_blank(),
axis.title.x = element_blank(),
axis.title.y = element_blank(),
plot.margin = margin(0,0,0,0))
graph3
#Using cowplot, I stitched the three panels into one graph. The NULL panels with negative rel_heights pull neighbouring panels together and remove the white space between them.
library(cowplot);library(grid); library(egg) #egg provides set_panel_size()
graph4 <- cowplot::plot_grid(graph3,NULL, graph2, NULL,graph1, align = "v", ncol = 1, rel_heights = c(0.25,-0.01,0.25,-0.01,0.5)) #adjust spacing with the negative values here.
graph4 <- set_panel_size(graph4, width = unit(7, "cm"), height = unit(6, "cm"))
grid.newpage()
grid.draw(graph4)
Overall, I am happy that I can now adjust the spacing. However, this introduces another issue with the axis title. It looks like the layering draws the last-listed graph on top, so the title from the middle graph is cut off by the bottom graph. Here is what I mean (the title is supposed to read "Longer title Values"):
So that is the next hurdle, but it may be a question that I have to ask in a separate post.
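One possible way around the clipped title, sketched here under assumptions rather than tested against the figure above, is to drop the y-axis title from every panel and draw a single label over the combined plot with cowplot's ggdraw(), draw_plot() and draw_label(). The panel() helper and its toy data are hypothetical stand-ins for graph3, graph2 and graph1, and the x, width and size values are guesses that would need tuning.

```r
library(ggplot2)
library(cowplot)

# Hypothetical stand-in for the three panels in the post; each one has no
# axis titles and zero plot margins, like graph1 and graph3 above.
panel <- function(lims) {
  ggplot(data.frame(x = 1, y = lims), aes(x, y)) +
    geom_point() +
    labs(x = NULL, y = NULL) +
    theme_classic() +
    theme(plot.margin = margin(0, 0, 0, 0))
}

# Same stacking approach as in the post: NULL spacers with negative heights.
stacked <- plot_grid(
  panel(c(1001, 40000)), NULL, panel(c(201, 1000)), NULL, panel(c(14, 200)),
  align = "v", ncol = 1, rel_heights = c(0.25, -0.01, 0.25, -0.01, 0.5)
)

# Reserve a strip on the left of the canvas and draw one rotated label in
# it, so no single panel owns the title and no panel can draw over it.
labelled <- ggdraw() +
  draw_plot(stacked, x = 0.06, y = 0, width = 0.94, height = 1) +
  draw_label("Longer title Values", x = 0.03, y = 0.5, angle = 90, size = 11)

labelled
```

Because the label is drawn after the stacked panels, it sits on top of them regardless of the order in which the panels were listed in plot_grid().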