Multivariate graphics recommendation for R


Could you please recommend the best way to visualize data with four variables, using any of the available R packages?

Namely, I have two categorical variables (population, with 12 levels, and character, with 50 levels) and two continuous variables (the mean and the coefficient of variation of each character length measurement for 100 individuals, stored as rows in a matrix). So it is basically a 12 x 50 x 100 x 100 dimensional graph.

Any suggestions?


Solution

  • I would plot the variables first one by one, then together, starting with the whole population and progressively slicing the data into the various groups.

    # Sample data
    n1 <- 6   # Was: 12
    n2 <- 5   # Was: 50
    n3 <- 10  # Was: 100
    d1 <- data.frame(
      population = rep(LETTERS[1:n1], each=n2*n3),
      character = rep(1:n2, each=n3, times=n1),  # times=n1 (not 12), so all columns have n1*n2*n3 rows
      id = 1:(n1*n2*n3),
      mean = rnorm(n1*n2*n3),
      var  = rchisq(n1*n2*n3, df=5)
    )
    # Long-format copy: not used by the lattice plots, but often useful with ggplot2
    library(reshape2)
    d2 <- melt(d1, id.vars=c("population","character","id"))
    
    # Look at the first variable
    library(lattice)
    densityplot( ~ mean, data=d1 )
    densityplot( ~ mean, groups=population, data=d1 )
    densityplot( ~ mean | population, groups=character, data=d1 )
    
    # Look at the second variable
    densityplot( ~ var, data=d1 )
    densityplot( ~ var, groups=population, data=d1 )
    densityplot( ~ var | population, groups=character, data=d1 )
    
    # Look at both variables
    xyplot( mean ~ var, data=d1 )
    xyplot( mean ~ var, groups=population, data=d1 )
    xyplot( mean ~ var | population, groups=character, data=d1 )
    
    # The plots may be more readable with smoothed (loess) lines rather than points
    xyplot( 
      mean ~ var | population, groups = character, 
      data = d1, 
      panel = panel.superpose, panel.groups = panel.loess
    )
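
The answer mentions that the melted data frame is often useful with ggplot2 but does not show it. The following is only a rough sketch of what that could look like, not part of the original answer: it rebuilds the same kind of sample data (with a fixed seed, which is my assumption), then draws faceted density plots from the long-format `d2` and a faceted scatterplot from `d1`. The object names `p1` and `p2` are illustrative.

```r
library(ggplot2)
library(reshape2)

# Hypothetical sample data with the same layout as in the answer (reduced sizes)
set.seed(1)
n1 <- 6   # populations
n2 <- 5   # characters
n3 <- 10  # individuals
d1 <- data.frame(
  population = rep(LETTERS[1:n1], each = n2 * n3),
  character  = rep(1:n2, each = n3, times = n1),
  id         = 1:(n1 * n2 * n3),
  mean       = rnorm(n1 * n2 * n3),
  var        = rchisq(n1 * n2 * n3, df = 5)
)

# Long format: one row per (individual, measure) pair
d2 <- melt(d1, id.vars = c("population", "character", "id"))

# Densities: one row of panels per population, one column per measure,
# one curve per character
p1 <- ggplot(d2, aes(x = value, colour = factor(character))) +
  geom_density() +
  facet_grid(population ~ variable, scales = "free") +
  labs(colour = "character")
print(p1)

# Both measures together: one panel per population, coloured by character
p2 <- ggplot(d1, aes(x = var, y = mean, colour = factor(character))) +
  geom_point() +
  facet_wrap(~ population) +
  labs(colour = "character")
print(p2)
```

With the full 12 populations and 50 characters, a colour legend with 50 entries is unlikely to be readable, so it may be worth dropping the legend or plotting a subset of characters at a time.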