I am producing a heat map from this table
x
. 0 1 2 3 4 5 6 7 8
0 12820 720 807 879 1051 824 587 732 874
1 557 38417 41289 44380 57301 42992 30805 41092 45616
2 62 59575 83247 72433 95751 76113 50002 92921 72773
3 23 45346 100836 57101 57903 50625 35223 52695 47868
4 4 14718 40000 13135 5985 13188 19252 8044 7095
5 0 4459 6828 674 890 5251 4959 399 563
6 0 216 333 142 115 189 202 128 128
7 0 188 97 30 57 255 19 51 29
8 1 20 38 13 7 4 11 44 17
9 0 11 9 1 0 8 12 102 6
10 7992 9620 16841 11065 9917 9619 8133 9291 8554
9 10
0 6 10804
1 55 26041
2 33 45915
3 17 35198
4 1 10071
5 0 2092
6 0 102
7 0 29
8 0 1
9 1 1
10 487 1319070
With this code
hv <- heatmap.2(x, col=cm.colors(255), scale="none",trace='none', density="none",lmat=rbind( c(0, 3), c(2,1), c(0,4) ), lhei=c(1.5, 4, 2 ),Rowv = FALSE, Colv = FALSE,dendrogram = "none",margins=c(5,5),xlab="ICOADS ship speed indicator",ylab="ICOADS course indicator")
I produce this plot
which is not very useful. I don't want to use scale="row" or scale="column", because that would make the result hard to interpret. Is there a way to keep scale="none" and still see different ranges of colours?
Thanks
Well, here's one way using ggplot.
library(reshape2)      # for melt
library(RColorBrewer)  # for brewer.pal(...)
library(ggplot2)
# assumes x is a data frame with columns named X0..X10 (as read.table would produce)
x <- cbind(id=as.numeric(rownames(x)),x)            # row labels as an id column
gg <- melt(x,id="id")                               # wide -> long
gg$variable <- as.numeric(substr(gg$variable,2,4))  # "X10" -> 10
brks <- c(0,10^(0:7))                               # 0,1,10,...,1e7
gg$bin <- cut(gg$value,breaks=brks,labels=0:7,include.lowest=TRUE)
ggplot(gg) +
  geom_tile(aes(x=factor(id),y=factor(10-variable),fill=bin))+
  # one legend label per bin: its lower break
  scale_fill_manual(name="",labels=brks[-length(brks)],values=rev(brewer.pal(8,"Spectral")))+
  scale_y_discrete(labels=10:0)+
  labs(x="",y="")+
  theme_bw()+theme(panel.border=element_blank())
The basic idea is to use a logarithmic scale for the colors. This is a bit of a problem for you because you have zeros, and log(0) is -Inf. So the work-around is to set up bins using the cut(...) function. The bin breaks are set to 0,1,10,100,...,1e7.
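To see what the binning does on its own, here is a minimal sketch that applies the same breaks to a handful of values taken from your table. The names vals and bin are just for illustration.

```r
# A few counts from the table, spanning several orders of magnitude
vals <- c(0, 1, 4, 55, 720, 12820, 100836, 1319070)

# Same breaks as in the answer: 0, 1, 10, ..., 1e7 (9 breaks -> 8 bins)
brks <- c(0, 10^(0:7))

# Intervals are right-closed; include.lowest=TRUE makes the first one [0,1]
bin <- cut(vals, breaks = brks, labels = 0:7, include.lowest = TRUE)

print(data.frame(value = vals, bin = bin))
```

Note that 0 and 1 both land in the first bin, because the intervals are closed on the right. If you need zeros to stand out, one option would be to add a separate break below 1.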
Your data is in "wide" format: different values of "y" in different columns, whereas ggplot needs the data in "long" format: all the data in one column with x-values and y-values in separate columns. We convert wide to long using the melt(...) function. The rest of the code sets up the bins and the labels, and formats the plot to make it look as close to yours as I could manage.
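As a small illustration of the wide-to-long step, here is a sketch on a two-row, two-column slice of the table. The data frame wide and its X0/X1 column names are assumptions (they mimic what read.table produces for numeric headers), not something taken from your session.

```r
library(reshape2)  # for melt

# Tiny "wide" slice of the table: one column per y-value (assumed names X0, X1)
wide <- data.frame(id = 0:1, X0 = c(12820, 557), X1 = c(720, 38417))

# "Long" format: one row per (id, variable) pair, all counts in a single column
long <- melt(wide, id.vars = "id")

# Strip the leading "X" so the column label becomes a number again
long$variable <- as.numeric(substr(as.character(long$variable), 2, 4))

print(long)
```

Each row of long now corresponds to one tile of the heat map, which is the shape geom_tile expects.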