ruby gnuplot heatmap som

ruby gnuplot heatmap for Self Organizing Map outputs


I'm running some experiments with SOMs in Ruby. After finding the best matching unit, my output looks like this: x coordinate, y coordinate, and distance from the best matching unit.

The output has 200 lines.

I want to make a heatmap from this output, but I'm having trouble with gnuplot.

The code I currently use:

Gnuplot.open do |gp|
  Gnuplot::Plot.new( gp ) do |plot|
    plot.title  "heatmap"
    plot.xlabel "x"
    plot.ylabel "y"
    plot.xrange "[0:50]"        
    plot.yrange "[0:50]"        
    plot.terminal "png"
    plot.output "myheatmap.png"
    plot.set "pm3d map"
    plot.set "palette rgbformula -7,2,-7" # for green heatmap
    plot.set "autoscale fix"       
    plot.cbrange "[0.0:0.036]" # range of distance
    plot.cblabel "Score"     
    plot.unset "cbtics"
    plot.data << Gnuplot::DataSet.new( [x,y,dist] ) do |ds|
      ds.with = "image"
      ds.notitle
    end
  end
end

The x and y arrays contain integers (the coordinates) and the dist array contains float values.

I've already searched here and on the gnuplot site. The green palette is taken from this question: Gnuplot: how to write the z values in a heatmap plot

I succeeded in plotting the distances with points, but I need a more visual plot, such as a heatmap.

Any help would be appreciated, thanks.

Edit: Hi Christoph, my data looks like the few lines below:

24,15,1.1214532012163798e-13,
41,41,2.088613932696844e-13,
4,10,1.485599551044706e-13,
0,20,7.981851602311569e-14,
6,46,1.1231898879790176e-13,
8,24,1.2844471889344152e-13,
11,24,2.3794505905958835e-13,
3,16,0.015633285670666745,
3,46,1.238425800407315e-13,
4,20,1.2695729760609708e-13,

I map each column to the appropriate type (int for the first two, float for the last). x and y are the node coordinates, and the last column is the distance between the input and the best matching unit. I want to plot a 50x50 heatmap and color each node according to its distance (white for a short distance, darker green for the largest distance).
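The column mapping described above could be sketched as follows. This is only an illustration, assuming the output is available as comma-separated lines with a trailing comma, as in the sample; `parse_bmu_lines` is a hypothetical helper name, not something from the original code.

```ruby
# Hypothetical helper: split "x,y,dist," lines into three parallel arrays.
def parse_bmu_lines(lines)
  x, y, dist = [], [], []
  lines.each do |line|
    fields = line.strip.split(",")   # String#split drops the empty field after the trailing comma
    next if fields.size < 3          # skip blank or malformed lines
    x << Integer(fields[0])          # node x coordinate
    y << Integer(fields[1])          # node y coordinate
    dist << Float(fields[2])         # distance to the best matching unit
  end
  [x, y, dist]
end

sample = [
  "24,15,1.1214532012163798e-13,",
  "3,16,0.015633285670666745,"
]
x, y, dist = parse_bmu_lines(sample)
```

`Integer` and `Float` raise on unparsable input, which makes a bad line visible immediately instead of silently turning it into 0.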

The main problem is that I haven't managed to plot what I want with the code pasted above. I guess I can't post a screenshot since I don't have enough posts yet.

Edit 2: I've also tried rounding the distance column, with no change.

Edit 3: I finally transformed my output (3 vectors: x, y, distance) into a regular grid.

I created a two-dimensional array filled with 1.0:

ntab = Array.new(50) { Array.new(50) { 1.0 } }

Then I look up the matching indices in the x and y vectors, which tells me that the node is present in my vectors, and fill the corresponding ntab cell with its distance value:

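# Note: coordinates are treated as 1-based here, so a node at 0 is skipped.
(1..50).each do |xv|
  (1..50).each do |yv|
    # [value, index] pairs for every entry equal to the current coordinate
    resx = x.each_with_index.select { |xi, _idx| xi == xv }
    resy = y.each_with_index.select { |yi, _idy| yi == yv }
    resx.each do |_value, idx|
      # the node (xv, yv) exists if the same index also matches in y
      ntab[xv - 1][yv - 1] = dist[idx] if resy.include?([yv, idx])
    end
  end
end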
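The same grid can probably be filled in a single pass over the three vectors instead of scanning them for every cell. The sketch below is an assumption-laden alternative, not the code actually used: `build_grid` and its `size`, `fill` and `origin` parameters are hypothetical names, and `origin: 1` mirrors the 1-based loop above (pass `origin: 0` if the coordinates start at 0).

```ruby
# Hypothetical helper: scatter (x, y, dist) triples into a size x size grid.
# Cells without a node keep the fill value. `origin` is the smallest
# coordinate value (assumed to be 1, matching the loop above).
def build_grid(x, y, dist, size: 50, fill: 1.0, origin: 1)
  grid = Array.new(size) { Array.new(size, fill) }
  x.zip(y, dist).each do |xv, yv, d|
    i = xv - origin
    j = yv - origin
    # ignore nodes that fall outside the grid
    next unless (0...size).cover?(i) && (0...size).cover?(j)
    grid[i][j] = d                   # a later duplicate overwrites an earlier one
  end
  grid
end

ntab = build_grid([3, 24], [16, 15], [0.0156, 1.1e-13])
```

With 200 nodes this touches 200 cells rather than running 2500 scans of both vectors, which matters little at this size but keeps the intent easier to read.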

I now have a perfect 50x50 grid from which I can plot the distance values. I'll edit my question with the final working code if I find it :)

Thanks, Christoph, for the advice about gnuplot.


Solution

  • The result I ended up with isn't really what I wanted, but it does the trick for analyzing SOM clusters.

    Fill the node array with the maximum distance value and then plot with the code below (I didn't include a link because I also need 10 reputation points to post more than 2 links :/).

    Gnuplot.open do |gp|
      Gnuplot::SPlot.new( gp ) do |plot|
        plot.pm3d
        plot.hidden3d
        plot.palette 'defined (   0 "black", 51 "blue", 102 "green", 153 "yellow", 204 "red", 255 "white" )'
        plot.data << Gnuplot::DataSet.new( ntab ) do |ds|
          ds.with = 'lines'
          ds.matrix = true
        end
      end
    end   
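For a flat heatmap closer to the original goal, one option might be to write the grid to a text matrix and let gnuplot draw it with `matrix with image`. This is a rough, untested-against-the-real-data sketch that bypasses the gnuplot gem: `grid_to_matrix_text`, `heatmap_script` and the file names are hypothetical, and the palette is the green one from the question.

```ruby
# Hypothetical helper: dump a grid as a whitespace-separated text matrix.
# gnuplot's `matrix` format reads each text row as one y value, so the grid
# is transposed to keep ntab[x][y] with x on the horizontal axis.
def grid_to_matrix_text(grid)
  grid.transpose.map { |row| row.join(" ") }.join("\n") + "\n"
end

# Hypothetical helper: build the gnuplot commands for an image heatmap.
def heatmap_script(data_file, png_file, cb_max)
  <<~GNUPLOT
    set terminal png
    set output '#{png_file}'
    set palette rgbformulae -7,2,-7
    set cbrange [0:#{cb_max}]
    set autoscale fix
    plot '#{data_file}' matrix with image notitle
  GNUPLOT
end

ntab = [[1.0, 0.2], [0.5, 1.0]]      # tiny stand-in for the 50x50 grid
matrix_text = grid_to_matrix_text(ntab)
script = heatmap_script("grid.dat", "myheatmap.png", ntab.flatten.max)
# To render: File.write("grid.dat", matrix_text); File.write("heatmap.gp", script)
# and then run `gnuplot heatmap.gp` from a shell.
```

Because almost all distances in the sample are around 1e-13 and only a few are around 0.01, a linear color scale will show most nodes in the same shade; whether that is acceptable depends on what the plot is meant to reveal.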
    
