Tags: r, analytics, correlation

How to produce a meaningful draftsman/correlation plot for discrete values


One of my favorite tools for exploratory analysis is pairs(). However, when the data take only a limited number of discrete values, it falls flat: the points all land on the same few grid positions and overplot one another. Consider the following:

y <- t(rmultinom(n=1000,size=4,prob=rep(.25,4)))
pairs(y)

It doesn't really give a good sense of correlation. Is there an alternative plot style that would?
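
To see why the plain plot is uninformative, it helps to count how few distinct positions the points can occupy. This is a minimal sketch using the same simulated data; the seed and the names counts, n_distinct and r are illustrative additions, not part of the original question. With equal probabilities of .25, the theoretical correlation between any two multinomial components is -1/3, so there is a real relationship for a plot to reveal.

```r
set.seed(42)  # illustrative seed, only for reproducibility
y <- t(rmultinom(n = 1000, size = 4, prob = rep(.25, 4)))

# Tabulate how many of the 1000 rows fall on each (column 1, column 2) pair
counts <- table(y[, 1], y[, 2])
n_distinct <- sum(counts > 0)  # number of distinct plotting positions
n_distinct                     # at most 15, because the two counts sum to <= 4

# The sample correlation is negative even though the scatterplot hides it
r <- cor(y[, 1], y[, 2])
r
```

All 1000 observations are drawn on top of at most 15 positions per panel, which is why the density of points, and with it the correlation, is invisible.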


Solution

  • If you convert y to a data.frame, you can add some jitter to each column and use the col option to set the transparency level (the 4th argument to rgb is the alpha value):

    y <- data.frame(y)
    pairs(sapply(y,jitter), col = rgb(0,0,0,.2))
    


    Or you could use ggplot2's plotmatrix:

    library(ggplot2)
    plotmatrix(y) + geom_jitter(alpha = .2)
    


    Edit: Since plotmatrix in ggplot2 is deprecated, use ggpairs instead (from the GGally package mentioned in @hadley's comment above):

    library(GGally)
    ggpairs(y, lower = list(params = c(alpha = .2, position = "jitter")))
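
Newer GGally releases may no longer accept the params argument used in the ggpairs call above, and expect extra arguments to be supplied through wrap() instead. The following is a sketch of the equivalent call under that assumption. The seed and the requireNamespace guard are additions for illustration, so check the documentation of your installed GGally version before relying on it.

```r
set.seed(1)  # illustrative seed, only for reproducibility
y <- data.frame(t(rmultinom(n = 1000, size = 4, prob = rep(.25, 4))))

# Only run the plotting code when GGally is actually installed
if (requireNamespace("GGally", quietly = TRUE)) {
  library(GGally)
  # wrap() attaches alpha and position to the "points" panel function,
  # which passes them on to geom_point()
  p <- ggpairs(
    y,
    lower = list(continuous = wrap("points", alpha = 0.2, position = "jitter"))
  )
  print(p)
}
```

The lower panels then show jittered, semi-transparent points, while the diagonal and upper panels keep the ggpairs defaults.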
    

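
Returning to the base-graphics variant at the top of this answer, the amount of jitter can also be set explicitly rather than left to the default. This is a minimal sketch, assuming the values are integer counts one unit apart. The seed, the name yj, the amount of 0.25 and the pch setting are illustrative choices, not part of the original answer.

```r
set.seed(1)  # illustrative seed, only for reproducibility
y <- data.frame(t(rmultinom(n = 1000, size = 4, prob = rep(.25, 4))))

# jitter(x, amount = a) adds uniform noise in [-a, a]; 0.25 keeps
# neighbouring integer levels visually separate
yj <- sapply(y, jitter, amount = 0.25)

# rgb(red, green, blue, alpha): alpha = 0.2 makes each point 20% opaque,
# so darker clusters indicate more observations
pairs(yj, col = rgb(0, 0, 0, alpha = 0.2), pch = 16)
```

Keeping the noise below half the spacing between levels means each cloud of points can still be read as belonging to a single original value.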