I want to generate a 2D plot like this from X, Y, Z data. I have example data that looks like this:
0 1 C
0 2 G
0 3 T
1 2 C
1 1 H
1 3 G
2 1 T
2 2 C
2 3 G
The problem is that Z is given as characters, and I want a specific color for each character, like the one shown here. Thanks in advance.
One possibility is to build an image from the data:
import matplotlib.pyplot as plt
import numpy as np
# the input data is in a list of tuples (x, y, z)
# where x and y are coordinates in (0..n-1) and z a base type in TCAG
indata = [
    (0, 1, 'C'),
    (0, 2, 'G'),
    (0, 3, 'T'),
    (1, 2, 'C'),
    (1, 1, 'H'),
    (1, 3, 'G'),
    (2, 1, 'T'),
    (2, 2, 'C'),
    (2, 3, 'G')]
# you want to have a color for each character:
colordict = {
    'C': (1, 1, 0, 1),
    'G': (0, 1, 0, 1),
    'A': (1, 0, 0, 1),
    'T': (0, 1, 1, 1),
    'H': (1, 0, .5, 1)}
# find the maximum positions in x and y
xmax = max(indata, key=lambda p: p[0])[0]
ymax = max(indata, key=lambda p: p[1])[1]
# create an image
img = np.zeros((ymax+1, xmax+1, 4))
# populate the image with the correct colors
for p in indata:
    # rows are indexed by y, columns by x
    img[p[1], p[0]] = colordict[p[2]]
# show the color map:
fig = plt.figure()
ax = fig.add_subplot(111)
ax.imshow(img, aspect='auto', interpolation='nearest', origin='lower')
plt.show()
This gives:
Of course, you'll need to do something with the axes (labeling, scale, ticks, etc.), but that depends on your needs. Each coordinate now shows the color assigned to its character. (The colors are defined in RGBA, where A is opacity. Cells with no data point stay at zero, i.e. fully transparent.)
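As a rough sketch of the axes work mentioned above, something like the following could put one tick on each cell and add a legend that maps colors back to characters. The helper name decorate_axes and the tiny 2x2 example image are made up for illustration; only the matplotlib calls themselves are standard.

```python
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.patches import Patch


def decorate_axes(ax, img, colordict):
    """Hypothetical helper: one tick per cell plus a character legend."""
    nrows, ncols = img.shape[:2]
    # imshow centers each cell on an integer coordinate
    ax.set_xticks(np.arange(ncols))
    ax.set_yticks(np.arange(nrows))
    ax.set_xlabel("X")
    ax.set_ylabel("Y")
    # one colored patch per character, used only for the legend
    handles = [Patch(facecolor=color, edgecolor="black", label=char)
               for char, color in colordict.items()]
    ax.legend(handles=handles, title="Z",
              loc="center left", bbox_to_anchor=(1.02, 0.5))
    return handles


# tiny stand-in image (assumed data, not the one from the question)
colordict = {'C': (1, 1, 0, 1), 'G': (0, 1, 0, 1)}
img = np.zeros((2, 2, 4))
img[0, 0] = colordict['C']
img[1, 1] = colordict['G']

fig, ax = plt.subplots()
ax.imshow(img, aspect='auto', interpolation='nearest', origin='lower')
handles = decorate_axes(ax, img, colordict)
fig.tight_layout()
```

Call plt.show() afterwards to display the figure. The legend is placed outside the axes so it does not cover any cells.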
BTW, if you would like a bit more control, pcolor or pcolormesh (the latter being faster than the former) are worth a look. (imshow is still faster but more constrained.)
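For what it's worth, here is a minimal sketch of the pcolormesh route. It assumes you map each character to an integer code and color the codes with a ListedColormap; the helper build_code_grid and the use of -1 as a "no data" marker are my own choices for illustration, not anything matplotlib prescribes.

```python
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.colors import BoundaryNorm, ListedColormap

indata = [(0, 1, 'C'), (0, 2, 'G'), (0, 3, 'T'),
          (1, 2, 'C'), (1, 1, 'H'), (1, 3, 'G'),
          (2, 1, 'T'), (2, 2, 'C'), (2, 3, 'G')]
colordict = {'C': (1, 1, 0, 1), 'G': (0, 1, 0, 1), 'A': (1, 0, 0, 1),
             'T': (0, 1, 1, 1), 'H': (1, 0, .5, 1)}


def build_code_grid(indata, chars):
    """Hypothetical helper: masked 2D array of integer codes, indexed [y, x]."""
    code = {c: i for i, c in enumerate(chars)}
    xmax = max(p[0] for p in indata)
    ymax = max(p[1] for p in indata)
    # -1 marks cells without data; they are masked and left blank
    grid = np.full((ymax + 1, xmax + 1), -1, dtype=int)
    for x, y, z in indata:
        grid[y, x] = code[z]
    return np.ma.masked_less(grid, 0)


chars = list(colordict)
grid = build_code_grid(indata, chars)

# one color per integer code, with bin edges halfway between the codes
cmap = ListedColormap([colordict[c] for c in chars])
norm = BoundaryNorm(np.arange(len(chars) + 1) - 0.5, cmap.N)

fig, ax = plt.subplots()
mesh = ax.pcolormesh(grid, cmap=cmap, norm=norm, edgecolors='k', linewidth=0.5)

# label the colorbar with the characters instead of the integer codes
cbar = fig.colorbar(mesh, ax=ax, ticks=np.arange(len(chars)))
cbar.ax.set_yticklabels(chars)
```

Call plt.show() to display it. Note that, unlike imshow, pcolormesh called this way draws cell (x, y) between x and x+1 rather than centered on x, so the ticks fall on cell edges.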