
R: How to draw a violin plot or box-percentile plot that takes sample size into account?


To use a simple example, let's say:

A = rnorm(10)
B = rnorm(100)
C = rnorm(500)

library(vioplot)
vioplot(A,B,C)

My question is how to create such a graph so that it takes sample size into account. 'C' has a much larger sample than 'A'; is there a way for the violin for 'C' to be drawn "bigger" than the one for 'A'? I suppose this would be a density scaled across the three classes: even though the distribution shapes of 'A' and 'C' may be the same, rather than showing identical images, the plot would draw 'A' smaller than both 'B' and 'C' because of its smaller sample size.


Solution

  • Unfortunately, vioplot() doesn't accept vectors for some of its parameters, so here's a workaround. The helpful arguments are at and wex, together with add=TRUE: plot each violin individually, giving each one its own position (at) and relative width (wex). The code below derives the widths from the square root of each sample size; you may need to adjust that scaling to suit your data.

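    library(vioplot)

    # sample sizes of the two groups
    n <- c(100, 1000)
    # widths based on sqrt(n); scale() without centering divides by the root mean square
    size <- scale(sqrt(n), center = FALSE)

    x1 <- rnorm(n[1])
    x2 <- rnorm(n[2])

    # initialize the plot area (only a dotted reference line at y = 0 is drawn)
    plot(0:3, rep(0, 4), type = 'l', xlim = c(0, 3), ylim = c(-4, 4), ylab = "", xlab = "", xaxt = "n", lty = 3)

    # fill in the violins at specific x locations using the `wex` parameter for size
    vioplot(x1, at = 1, wex = size[1], add = TRUE, col = "darkgray")
    vioplot(x2, at = 2, wex = size[2], add = TRUE, col = "darkgray")
    axis(1, at = 1:2, labels = c("Mon", "Tues"))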
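
    The same idea can be wrapped in a loop for any number of groups, for example the A, B and C from the question. This is only a sketch: violin_widths() is a hypothetical helper (not part of vioplot), and the square-root scaling normalised so the largest group gets wex = 1 is an assumption you may want to change. The plotting part is skipped if vioplot is not installed.

```r
# Hypothetical helper (not part of vioplot): map sample sizes to violin widths.
# Assumption: widths grow with sqrt(n) and the largest group gets `max_wex`.
violin_widths <- function(n, max_wex = 1) {
  max_wex * sqrt(n) / sqrt(max(n))
}

set.seed(1)  # only to make the example reproducible
samples <- list(A = rnorm(10), B = rnorm(100), C = rnorm(500))
widths <- violin_widths(lengths(samples))

if (requireNamespace("vioplot", quietly = TRUE)) {
  # empty canvas with one x position per group
  plot(0, 0, type = "n",
       xlim = c(0.5, length(samples) + 0.5),
       ylim = range(unlist(samples)) + c(-1, 1),
       xlab = "", ylab = "", xaxt = "n")

  # draw each violin separately so it can have its own width
  for (i in seq_along(samples)) {
    vioplot::vioplot(samples[[i]], at = i, wex = widths[i],
                     add = TRUE, col = "darkgray")
  }
  axis(1, at = seq_along(samples), labels = names(samples))
}
```

    With this scaling the width of 'A' comes out at roughly 14% of the width of 'C'. If the small groups become too thin to read, a gentler transform such as log(n) in place of sqrt(n) is one option.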
    
