I'm currently writing a script that creates a dot plot from two given sequences. So far I can get a lovely lil ASCII dot plot:
The X axis is: >HeaderOfSeq1
X = ATCGTAGCTACGTACGT
The Y axis is: >HeaderOfSeq2
Y = ATGCGATCGTGCTAC
ATGCGATCGTGCTAC
===============|
\    \       \ |A
 \    \  \  \  |T
   \   \   \  \|C
  \ \   \ \    |G
 \    \  \  \  |T
\    \       \ |A
  \ \   \ \    |G
   \   \   \  \|C
 \    \  \  \  |T
\    \       \ |A
   \   \   \  \|C
  \ \   \ \    |G
 \    \  \  \  |T
\    \       \ |A
   \   \   \  \|C
  \ \   \ \    |G
 \    \  \  \  |T
This is with the --ascii filter that is also part of the script (without that filter, the \ characters are replaced by the letters that matched). Now what I want and need to do is turn this into a matplotlib plot.
I am kinda stuck at this point. I've used meshgrid from NumPy to get two arrays with all possible combinations, and I was hoping it would be fairly simple to overlay them and return a contour graph, maybe, that essentially shows the above dot plot but just much prettier. Matplotlib is a requirement btw, standardisation and all that. I can't do anything with the meshgrids (that I know of anyway) because they contain strings, so I'm stuck.
Any help would be greatly appreciated!! I'll also post some of the actual code if needed too.
IIUC, you can do:
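import numpy as np
import matplotlib.pyplot as plt

X, Y = 'ATCGTAGCTACGTACGT', 'ATGCGATCGTGCTAC'
# Split each sequence into an array of single characters
X, Y = np.array(list(X)), np.array(list(Y))
# Broadcasting compares every character of Y (rows) with every character of X (columns)
plt.imshow(X==Y[:,None]) # the magic happens here, contourf should work similarly
plt.xticks(np.arange(len(X)), X)
plt.yticks(np.arange(len(Y)), Y)
plt.show()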
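Output: [image: the match matrix drawn with imshow, with the characters of X as x-axis ticks and the characters of Y as y-axis ticks]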
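If you would rather have actual dots than a filled image, here is a minimal sketch of the same idea wrapped in two functions. The names match_matrix and plot_dotplot are made up for illustration and are not part of NumPy or matplotlib. It assumes both sequences are plain strings and that a match means exact character equality, with no windowing or scoring.

```python
import numpy as np


def match_matrix(seq_x, seq_y):
    """Return a boolean array of shape (len(seq_y), len(seq_x)).

    Entry [i, j] is True where seq_y[i] == seq_x[j].
    """
    x = np.array(list(seq_x))
    y = np.array(list(seq_y))
    # y[:, None] has shape (len(y), 1); broadcasting it against x, which has
    # shape (len(x),), compares every character of y with every character of x.
    return x == y[:, None]


def plot_dotplot(seq_x, seq_y, ax=None):
    """Draw one marker per matching position (hypothetical helper)."""
    # Imported here so that match_matrix can be used without matplotlib.
    import matplotlib.pyplot as plt

    matches = match_matrix(seq_x, seq_y)
    # rows index into seq_y, cols index into seq_x
    rows, cols = np.nonzero(matches)
    if ax is None:
        _, ax = plt.subplots()
    ax.scatter(cols, rows, marker="s", s=20, color="black")
    ax.set_xticks(np.arange(len(seq_x)))
    ax.set_xticklabels(list(seq_x))
    ax.set_yticks(np.arange(len(seq_y)))
    ax.set_yticklabels(list(seq_y))
    # Put the first character of seq_y at the top, like imshow does by default.
    ax.invert_yaxis()
    return ax
```

To use it, call plot_dotplot('ATCGTAGCTACGTACGT', 'ATGCGATCGTGCTAC') and then plt.show() from matplotlib.pyplot. Keeping the matrix in its own function means you can reuse it for the ASCII output as well, since each True entry corresponds to one \ in the text version.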