r, hierarchical-clustering

Randomizations and hierarchical tree


I am trying to permute my data matrix (column-wise only) 1000 times and then do hierarchical clustering in R, so that I end up with a final tree for my data after 1000 randomizations. This is where I am lost. I have this loop:

    for (i in 1:1000) {
      # this permutes my columns
      permuted <- test2_matrix[, sample(ncol(test2_matrix), 12, replace = TRUE)]
      d <- dist(permuted, method = "euclidean", diag = FALSE, upper = FALSE, p = 2)
      clust <- hclust(d, method = "complete", members = NULL)
    }
    png(filename = "cluster_dendrogram_bootstrap.png", width = 1024, height = 1024, pointsize = 10)
    plot(clust)
    dev.off()  # close the device so the PNG file is actually written

I am not sure whether the final tree is a product of all 1000 randomizations or just the last tree calculated in the loop. Also, if I want to display the bootstrap values on the tree, how should I go about it?

Many thanks!!


Solution

  • The value of clust in your example is indeed just the last tree calculated in the loop, because each iteration overwrites it. Here's a way of building and saving all 1000 trees, one per resampled matrix:

    make.permuted.clust <- function(i) { # i is only the iteration index; it is not used
      # draw 12 columns with replacement (a bootstrap resample rather than a strict permutation)
      permuted <- test2_matrix[, sample(ncol(test2_matrix), 12, replace = TRUE)]
      d <- dist(permuted, method = "euclidean", diag = FALSE, upper = FALSE, p = 2)
      clust <- hclust(d, method = "complete", members = NULL)
      clust # return value
    }
    
    all.clust <- lapply(1:1000, make.permuted.clust) # list of 1000 hclust trees
    

    The second part of your question should be answered here.
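
    For the bootstrap values, one common approach is to take the tree built from the original data and, for each of its clusters, count the fraction of resampled trees that contain exactly the same set of rows. Below is a minimal base-R sketch of that idea, not a definitive implementation. The random test2_matrix (8 rows, 12 columns), the helper clade.keys, and the choice of 200 resamples are assumptions made for illustration only.

```r
# Hypothetical stand-in for the asker's data: 8 rows (the items clustered), 12 columns.
set.seed(1)
test2_matrix <- matrix(rnorm(8 * 12), nrow = 8,
                       dimnames = list(paste0("row", 1:8), NULL))

# Illustrative helper (not part of any package): for every merge in an hclust
# tree, collect the row indices below that node and encode them as a string
# such as "2,5,7", so that clusters can be compared across trees.
clade.keys <- function(hc) {
  members <- vector("list", nrow(hc$merge))
  for (k in seq_len(nrow(hc$merge))) {
    # negative entries are single observations, positive entries are earlier merges
    parts <- lapply(hc$merge[k, ], function(j) if (j < 0) -j else members[[j]])
    members[[k]] <- sort(unlist(parts))
  }
  vapply(members, paste, character(1), collapse = ",")
}

# Reference tree on the original data.
ref.tree <- hclust(dist(test2_matrix), method = "complete")
ref.keys <- clade.keys(ref.tree)

# Clusters found in each resampled tree (columns drawn with replacement).
boot.keys <- replicate(200, {
  cols <- sample(ncol(test2_matrix), replace = TRUE)
  clade.keys(hclust(dist(test2_matrix[, cols]), method = "complete"))
}, simplify = FALSE)

# Support for each reference cluster: share of resampled trees containing it.
support <- vapply(ref.keys, function(key) {
  mean(vapply(boot.keys, function(b) key %in% b, logical(1)))
}, numeric(1))

print(round(support, 2))
```

    Entry k of support belongs to row k of ref.tree$merge, whose node sits at height ref.tree$height[k]; the last entry is the root and is always 1. The pvclust package on CRAN provides a packaged version of this kind of analysis and draws the values on the dendrogram, but note that it clusters columns, so the matrix would likely need to be transposed first.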