
Creating violin plots in ggplot2 for vectors of different lengths?


I have two vectors of different lengths and want to create violin plots for them. What I am currently doing is to cbind them, which causes the shorter vector to be repeated until it matches the length of the longer one (this recycling is what cbind does by default in R).

library(ggplot2)

C1 <- rnorm(100)
C2 <- rnorm(500)

dat <- cbind(C1,C2)

# Violin plots for columns
mat <- reshape2::melt(data.frame(dat), id.vars = NULL)
pp <- ggplot(mat, aes(x = variable, y = value)) +
  geom_violin(scale = "width", adjust = 1, width = 0.5, fill = "gray80")
pp

Would this affect the shape of the violin? Is there a more correct way of creating the violin plots without having to artificially increase the length of one of them?


Solution

  • Rather than calling cbind on two vectors of different lengths, which recycles the shorter one, and then melting the result, build one data frame per vector with a column that labels which group each value belongs to, and combine them with rbind. The data then starts out in the long format that ggplot expects, and no values from the shorter vector are repeated.

    library(ggplot2)
    
    set.seed(710)
    C1 <- data.frame(value = rnorm(100), variable = "C1")
    C2 <- data.frame(value = rnorm(500), variable = "C2")
    
    dat <- rbind(C1, C2)
    head(dat)
    #>         value variable
    #> 1 -0.97642446       C1
    #> 2 -0.51938107       C1
    #> 3  1.05793223       C1
    #> 4 -0.88139935       C1
    #> 5 -0.05997154       C1
    #> 6  0.31960235       C1
    
    ggplot(dat, aes(x = variable, y = value)) +
      geom_violin(scale = "width", adjust = 1, width = 0.5)
    

    Created on 2018-07-11 by the reprex package (v0.2.0).
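
As for whether recycling changes the shape of the violin: it probably does, at least a little. Repeating the shorter vector leaves its values unchanged but inflates its sample size, and the default bandwidth rule used for the density estimate (nrd0, to the best of my knowledge the default in geom_violin) shrinks as the sample size grows. The sketch below uses only base R to illustrate this, and also shows stack() as one possible shortcut for building the long-format data frame. The names bw_original, bw_recycled and long are made up for this illustration.

```r
set.seed(710)
C1 <- rnorm(100)
C2 <- rnorm(500)

# Recycling C1 five times (what cbind does here) keeps the same values
# but makes the sample look five times larger, so the rule-of-thumb
# bandwidth gets smaller and the density is smoothed less.
bw_original <- bw.nrd0(C1)
bw_recycled <- bw.nrd0(rep(C1, 5))
bw_recycled / bw_original  # a ratio below 1

# stack() converts a named list of unequal-length vectors into a long
# data frame with columns `values` and `ind`; nothing is recycled.
long <- stack(list(C1 = C1, C2 = C2))
table(long$ind)
```

With this layout the plot call would be ggplot(long, aes(x = ind, y = values)) + geom_violin(scale = "width", adjust = 1, width = 0.5), since stack() names its columns values and ind rather than value and variable.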