seqinr dotplot - change axis

I have to datasets: seq1 and seq2 (DNA sequences). I wanted to do a dataplot, comparing the two sequences and placing a dot where the two sequences match. I was able to accomplish this using seqinr's dotplot, but what I am unable to do is to list the sequences on the axis, so you can see which dots matching. Essentially, I want to replace the numbers with the sequence letters.

Is there anyway to do this? Maybe through ggplot2?

Dotplot

These are my sequences:

seq1 <- c("G","C","T","A","G","T","C","A","G","A","T","C","T","G","A","C","G","C","T","A")
seq2 <- c("G","A","T","G","G","T","C","A","C","A","T","C","T","G","C","C","G","C")

And this is how I generated this graph:

dotPlot(seq1, seq2, main = "Dot plot of 2 different sequences
\nwsize = 4, wstep = 1, nmatch = 3", wsize = 4, wstep = 1, nmatch = 3)

Solution

How about a modified version (aka "a quick-and-dirty hack") of the dotplot-function? First, copy and paste the following code into R:

dotplot<-function (seq1, seq2, wsize = 1, wstep = 1, nmatch = 1, cols = c("white", 
    "black"), xlab = deparse(substitute(seq1)), ylab = deparse(substitute(seq2)), type=2,
    ...) {

    cat("This is a modification of the function dotPlot from package seqinr.\n")

    require(seqinr)

    if (nchar(seq1[1]) > 1) 
        stop("seq1 should be provided as a vector of single chars")
    if (nchar(seq2[1]) > 1) 
        stop("seq2 should be provided as a vector of single chars")
    if (wsize < 1) 
        stop("non allowed value for wsize")
    if (wstep < 1) 
        stop("non allowed value for wstep")
    if (nmatch < 1) 
        stop("non allowed value for nmatch")
    if (nmatch > wsize) 
        stop("nmatch > wsize is not allowed")
    mkwin <- function(seq, wsize, wstep) {
        sapply(seq(from = 1, to = length(seq) - wsize + 1, by = wstep), 
            function(i) c2s(seq[i:(i + wsize - 1)]))
    }
    wseq1 <- mkwin(seq1, wsize, wstep)
    wseq2 <- mkwin(seq2, wsize, wstep)
    if (nmatch == wsize) {
        xy <- outer(wseq1, wseq2, "==")
    }
    else {
        "%==%" <- function(x, y) colSums(sapply(x, s2c) == sapply(y, 
            s2c)) >= nmatch
        xy <- outer(wseq1, wseq2, "%==%")
    }

    if(type==1) {
       image(x = seq(from = 1, to = length(seq1), length = length(wseq1)), 
           y = seq(from = 1, to = length(seq2), length = length(wseq2)), 
           z = xy, col = col, xlab = xlab, ylab = ylab, axes=F, ...)
       box()
    }

    colnames(xy)<-wseq2
    rownames(xy)<-wseq1

    xy2<-matrix(nrow=length(seq1), ncol=length(seq2), data=FALSE)
    rownames(xy2)<-seq1
    colnames(xy2)<-seq2
    ind<-which(xy, arr.ind=T)
    xy2[ind]<-TRUE
    ind<-data.frame(ind, row.names=NULL)    

    res<-data.frame(row=c(), col=c())
    for(i in 1:nrow(ind)) {
       DF<-data.frame(row=seq(from=ind$row[i], to=ind$row[i]+wsize-1),
                      col=seq(from=ind$col[i], to=ind$col[i]+wsize-1))
       res<-rbind(res, DF)
    }
    xy2[as.matrix(res)]<-TRUE

    if(type==2) {
       image(x = seq(from = 1, to = length(seq1)), y = seq(from = 1, to = length(seq2)), z = xy2, col = cols, xlab = xlab, ylab = ylab, axes=F, ...)
       box()
       axis(side=1, at=1:length(seq1), labels=seq1)
       axis(side=2, at=1:length(seq2), labels=seq2, las=1)
    }

    out<-list(type1=xy, type2=xy2)
    return(out)
}

Then running the example you provided:

seq1 <- c("G","C","T","A","G","T","C","A","G","A","T","C","T","G","A","C","G","C","T","A")
seq2 <- c("G","A","T","G","G","T","C","A","C","A","T","C","T","G","C","C","G","C")
xy<-dotplot(seq1, seq2, wsize = 4, wstep = 1, nmatch = 3)

should produce the following plot:

enter image description here

In short, I expanded the function a little to produce a full matrix between the two sequences. If you inspect the output object xy, you'll see the original (type1) matrix, and the expanded one (type2). For very long sequences this modification is not efficient, not to mention that the nucleotide/amino acid labels on the axes will overlap each other. You can change the plot type between type1 and type2 using the new argument type.