I have two variables (V1, V2) which I need to plot against each other in a simple scatter plot. Some rows are missing either V1 or V2 so will not be included on a plot, but the remaining information in these rows is still of interest.
I tried substituting the NAs with a value outside the data range and adding an 'NA' label on the axes, but because 'breaks' and 'labels' must be the same length, the extra label needs an extra break, and that produces additional grid lines.
Is it possible to have an axis label without a break? Any advice gratefully received!
Apologies that I can't post an image to illustrate my issue as I'm new to stackoverflow. Hopefully the code and link below will be enough.
# Simulated example data
library(ggplot2)
set.seed(112)
DF<-data.frame(V1=rnorm(20,10,4))
DF$V2<-DF$V1+rnorm(20,0,1)
DF[sample(1:dim(DF)[1],2),]$V1<-NA
DF[sample(1:dim(DF)[1],2),]$V2<-NA
# plot with NA rows removed
ggplot(DF,aes(x=V1,y=V2))+geom_point()+theme_bw()
# flag rows that had an NA, then substitute NAs with a value outside the data range
DF$WasNA<-!complete.cases(DF)
DF[is.na(DF$V1),]$V1<- -1
DF[is.na(DF$V2),]$V2<- -1
(p<-ggplot(DF,aes(x=V1,y=V2,colour=WasNA))+
geom_point()+
scale_colour_manual(values=c("black","grey70"))+
theme_bw())
# add an "NA" break and label to the default breaks on each axis
pp<-ggplot_build(p)$layout$panel_params[[1]]
p+
scale_x_continuous(breaks=c(-1,pp$x.major_source),labels=c("NA",pp$x.labels))+
scale_y_continuous(breaks=c(-1,pp$y.major_source),labels=c("NA",pp$y.labels))
(As an additional point of interest, I'm not sure why the extra break I add is also mirrored at the upper end of the scales.)
If you're using a plot design with a background grid, then I think there needs to be a grid line at the NA position. Otherwise the plot would look odd.
So my recommendation is to get rid of the minor grid lines. By default ggplot2 places minor grid lines midway between neighbouring breaks, so the extra break at -1 is what introduces the stray lines; removing the minor grid removes them.
p + scale_x_continuous(breaks=c(-1, ggplot_build(p)$layout$panel_params[[1]]$x.major_source),
labels=c("NA", ggplot_build(p)$layout$panel_params[[1]]$x.labels)) +
scale_y_continuous(breaks=c(-1, ggplot_build(p)$layout$panel_params[[1]]$y.major_source),
labels=c("NA", ggplot_build(p)$layout$panel_params[[1]]$y.labels)) +
theme(panel.grid.minor = element_blank())
If you want more grid lines, you could always define additional breaks (say at positions 2.5, 7.5, 12.5) and give them an empty label. This will simulate minor grid lines but at exactly the locations you want.
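As a rough sketch of that last suggestion, the code below sets all breaks by hand instead of reading them from ggplot_build(), whose internal structure differs between ggplot2 versions. The break positions (major, pseudo_minor) and the names brks, labs and p2 are illustrative choices of mine, not anything from the question, and the data are simulated the same way as in the question.

```r
library(ggplot2)

# Simulated data as in the question, with NAs replaced by -1
set.seed(112)
DF <- data.frame(V1 = rnorm(20, 10, 4))
DF$V2 <- DF$V1 + rnorm(20, 0, 1)
DF[sample(nrow(DF), 2), "V1"] <- NA
DF[sample(nrow(DF), 2), "V2"] <- NA
DF$WasNA <- !complete.cases(DF[, c("V1", "V2")])
DF$V1[is.na(DF$V1)] <- -1
DF$V2[is.na(DF$V2)] <- -1

# Hand-picked break positions (assumptions, adjust to your data range)
major <- seq(0, 20, by = 5)           # labelled grid lines
pseudo_minor <- seq(2.5, 17.5, by = 5) # grid lines with an empty label

# breaks and labels must have the same length
brks <- c(-1, major, pseudo_minor)
labs <- c("NA", major, rep("", length(pseudo_minor)))

p2 <- ggplot(DF, aes(x = V1, y = V2, colour = WasNA)) +
  geom_point() +
  scale_colour_manual(values = c("black", "grey70")) +
  scale_x_continuous(breaks = brks, labels = labs) +
  scale_y_continuous(breaks = brks, labels = labs) +
  theme_bw() +
  theme(panel.grid.minor = element_blank())
p2
```

Note that the empty-label breaks are still ordinary breaks, so they are drawn with the major grid line style and get tick marks on the axes. If that is not what you want, the minor_breaks argument of scale_x_continuous() and scale_y_continuous() is another way to place minor grid lines explicitly.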