
Producing a contour plot from binned data


I'm looking to produce a contour plot from data that I've binned. I have two columns: one holds the mass of a compound and the other holds its Pearson correlation coefficient. This is a small example of what I've done so far:

column1 <- as.numeric(c("100.01", "100.015", "100.017", "100.071", "100.099", "100.111", "100.153", "100.167"))
column2 <- as.numeric(c("0.89", "0.64", "-0.14", "-0.79", "1", "0.31", "-0.27", "0.45"))
test <- cbind(column1, column2)
bin1 <- seq(100, 100.2, by = 0.05)
bin2 <- seq(-1, 1, by = 0.5)
res <- data.frame(Map(function(x, y) cut(x, breaks = y),
                      as.data.frame(test), list(bin1, bin2)))

res1 <- cbind(test, res)
str(res1)
'data.frame':   8 obs. of  4 variables:
 $ column1: num  100 100 100 100 100 ...
 $ column2: num  0.89 0.64 -0.14 -0.79 1 0.31 -0.27 0.45
 $ column1: Factor w/ 4 levels "(100,100.05]",..: 1 1 1 2 2 3 4 4
 $ column2: Factor w/ 4 levels "(-1,-0.5]","(-0.5,0]",..: 4 4 2 1 4 3 2 3

From this I want to produce a contour plot showing how often values fall into each combination of a first-column (mass) bin and a second-column (correlation) bin. To get there, the bins in the fourth column need to be grouped by the bins in the third. So by doing:

combined <- split(res1[, 4], res1[, 3])
str(combined)
List of 4
 $ (100,100.05]  : Factor w/ 4 levels "(-1,-0.5]","(-0.5,0]",..: 4 4 2
 $ (100.05,100.1]: Factor w/ 4 levels "(-1,-0.5]","(-0.5,0]",..: 1 4
 $ (100.1,100.15]: Factor w/ 4 levels "(-1,-0.5]","(-0.5,0]",..: 3
 $ (100.15,100.2]: Factor w/ 4 levels "(-1,-0.5]","(-0.5,0]",..: 2 3

I then want to produce a plot in which the number of values that fall into the mass bin (100,100.05] is broken down by the four correlation factor levels. So if 20 values fall into the first bin, (100,100.05], I want to see how many of those fall into (-1,-0.5], how many into (-0.5,0], and so on, in turn building a 3D plot. Is there a way of doing this? I know I can do:

cbind(table(res1[, 3]))

to get the frequency of the values that fall into each mass bin. I just don't know how to extract the values that fall into each Pearson correlation coefficient bin for a given mass bin.

Cheers


Solution

  • You can try

    lapply(combined, table)
    

    to get, for each mass bin (the 3rd column), the frequency of each correlation bin in the 4th column.
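
As a possible follow-up, the same counts can be built in one step with a two-way `table()`, which gives a matrix that `contour()` can plot. This is only a minimal sketch using the sample data from the question: the names `mass`, `corr`, `counts`, `mass_mid` and `corr_mid` are illustrative and not from the original post, and placing the counts at the bin midpoints is an assumption about how the axes should be laid out.

```r
# Sample data from the question
mass <- c(100.01, 100.015, 100.017, 100.071, 100.099, 100.111, 100.153, 100.167)
corr <- c(0.89, 0.64, -0.14, -0.79, 1, 0.31, -0.27, 0.45)

mass_breaks <- seq(100, 100.2, by = 0.05)
corr_breaks <- seq(-1, 1, by = 0.5)

# Bin each variable with the same breaks used in the question
mass_bin <- cut(mass, breaks = mass_breaks)
corr_bin <- cut(corr, breaks = corr_breaks)

# Two-way frequency table: rows are mass bins, columns are correlation bins
counts <- table(mass_bin, corr_bin)
print(counts)

# Assumption: place each count at the midpoint of its bin
mass_mid <- head(mass_breaks, -1) + diff(mass_breaks) / 2
corr_mid <- head(corr_breaks, -1) + diff(corr_breaks) / 2

# unclass() drops the "table" class so z is passed as a plain numeric matrix
contour(mass_mid, corr_mid, unclass(counts),
        xlab = "Mass (bin midpoint)",
        ylab = "Pearson correlation (bin midpoint)")
```

Each row of `counts` holds the same numbers as the matching element of `lapply(combined, table)`. Note that `cut()` with its defaults leaves out a value exactly equal to the lowest break (here a correlation of -1); passing `include.lowest = TRUE` would keep it.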