Tags: r, cluster-analysis, binary-matrix

Biclustering in R


I want to apply biclustering to a binary matrix in R. There is a nice package called "biclust" available, but it does not do or display everything I want.

I have a binary matrix which looks like the following:

1 0 0 1 0 1 0
0 0 0 0 0 0 0
0 0 1 0 1 0 0
1 0 0 1 0 1 0
0 0 1 0 1 0 0
1 0 0 1 0 1 0
0 0 0 0 0 0 0

My goal is to bicluster it and display it as follows (possibly colored):

1 1 1 0 0 0 0
1 1 1 0 0 0 0
1 1 1 0 0 0 0
0 0 0 1 1 0 0
0 0 0 1 1 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0

Set up code:

# install.packages("biclust") (if necessary)
library("biclust")

testMatrix <- matrix(c(1,0,0,1,0,1,0,
                       0,0,0,0,0,0,0,
                       0,0,1,0,1,0,0,
                       1,0,0,1,0,1,0,
                       0,0,1,0,1,0,0,
                       1,0,0,1,0,1,0,
                       0,0,0,0,0,0,0),
                     nrow = 7,
                     ncol = 7,
                     byrow = TRUE)

I applied the biclust function of the "biclust" R package:

testCluster <- biclust(x = testMatrix, method=BCBimax())

and indeed I get the two clusters expected:

An object of class Biclust 
call:
biclust(x = testMatrix, method = BCBimax())
Number of Clusters found:  2 
First  2  Cluster sizes:
                      BC 1  BC 2
Number of Rows:       3     2
Number of Columns:    3     2

I can display each cluster separately with:

drawHeatmap(x = testMatrix, bicResult = testCluster, number = 1) # shown in picture below
drawHeatmap(x = testMatrix, bicResult = testCluster, number = 2)

[Picture: drawHeatmap output for cluster 1]

and I can display the entire clustered matrix (with one cluster in the upper-left corner) with:

drawHeatmap2(x = testMatrix, bicResult = testCluster, number = 1) # shown in picture below
drawHeatmap2(x = testMatrix, bicResult = testCluster, number = 2)

[Picture: drawHeatmap2 output for cluster 1]

So far so good, but I want:

  1. The display colors switched. Currently 1 is red and 0 is green.
  2. The row and column numbers of the original matrix. Currently drawHeatmap shows only the row and column numbers within the specific cluster, and drawHeatmap2 shows no row or column numbers at all for the entire clustered matrix.
  3. A nicely ordered clustered matrix. Currently only the cluster specified in drawHeatmap2 is shown in the upper-left corner; I want the remaining clusters to be ordered as well, running from the upper-left corner to the lower-right corner.

Are these changes possible with the "biclust" package? Or is it better to do this another way in R?


Solution

  • Change the drawHeatmap() function in the biclust package source:

    1. trace("drawHeatmap", edit = TRUE)
    2. Change the following:
      (a) Switch red and green: swap rvect and gvect in the call to rgb().
      (b) Original row and column names instead of the new ones: change the 'labels=' arguments to 'labels=bicCols' and 'labels=bicRows'.
    3. Print the row numbers: before the axis call for the rows, add cat(bicRows).
    4. Save the row numbers to a file: before the axis call for the rows, add write(bicRows, file="FILENAME.txt").
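
Point 3 of the question (ordering all clusters from the upper-left to the lower-right corner) is not covered by the steps above. As an alternative to patching drawHeatmap(), here is a minimal base-R sketch that is not part of the biclust package. It reorders rows and columns by the first bicluster they belong to and keeps the original row and column numbers as labels. The helper names clusterOrder and plotOrdered are made up for illustration, and the membership matrices are typed in by hand to stand in for a biclust result; with a real result they should be available as testCluster@RowxNumber and t(testCluster@NumberxCol).

```r
# Hypothetical helper: order items so that members of bicluster 1 come first,
# then bicluster 2, and so on; items in no bicluster go last.
# `member` is a logical matrix, one row per item and one column per bicluster.
clusterOrder <- function(member) {
  member <- as.matrix(member)
  first <- rep(Inf, nrow(member))
  # walk the clusters backwards so the lowest cluster number wins on overlap
  for (k in rev(seq_len(ncol(member)))) first[member[, k]] <- k
  order(first, seq_along(first))  # original position breaks ties
}

# Hypothetical helper: draw a 0/1 matrix with row 1 at the top and the
# original row/column labels on the axes (0 -> col0, 1 -> col1).
plotOrdered <- function(m, col0 = "red", col1 = "green") {
  n <- nrow(m)
  image(x = seq_len(ncol(m)), y = seq_len(n), z = t(m[n:1, , drop = FALSE]),
        zlim = c(0, 1), col = c(col0, col1), axes = FALSE, xlab = "", ylab = "")
  axis(3, at = seq_len(ncol(m)), labels = colnames(m))
  axis(2, at = seq_len(n), labels = rev(rownames(m)), las = 1)
}

testMatrix <- matrix(c(1,0,0,1,0,1,0,
                       0,0,0,0,0,0,0,
                       0,0,1,0,1,0,0,
                       1,0,0,1,0,1,0,
                       0,0,1,0,1,0,0,
                       1,0,0,1,0,1,0,
                       0,0,0,0,0,0,0),
                     nrow = 7, ncol = 7, byrow = TRUE)
# keep the original indices as labels so they survive the reordering
dimnames(testMatrix) <- list(as.character(1:7), as.character(1:7))

# Assumption: hand-written memberships standing in for the biclust result
# (cluster 1 = rows/cols 1, 4, 6; cluster 2 = rows/cols 3, 5).
rowMember <- cbind(1:7 %in% c(1, 4, 6), 1:7 %in% c(3, 5))
colMember <- rowMember  # this toy matrix happens to be symmetric

ordered <- testMatrix[clusterOrder(rowMember), clusterOrder(colMember)]
print(ordered)
```

Calling plotOrdered(ordered) then draws the reordered matrix; the two colour arguments can be swapped to get whichever assignment of red and green is wanted. Ordering by the first cluster an item belongs to is a simplification: it gives a clean block layout only when the biclusters do not overlap, as in this example.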