I think this might be an easy question, but I could not solve it after reading the pegas documentation. I want to plot an haplotype network using a FASTA file and identify which mutations are separatting the distinct haplotypes.
Example:
fa <- read.FASTA("example.fa")
haps <- haplotype(fa)
haps50 <- subset(haps, minfreq = 50)
(network <- haploNet(haps50))
plot(network, size = attr(network, "freq"),show.mutation=1,labels=T)
How can I identify the position of the mutation in my FASTA file that is separating for example haplotype XX
from V
?
Extra question:
Would it be also possible to know for example, what is the haplotype sequence of one of the haplotypes? For example haplotype V
, which is quite frequent?
The pegas
package contains a function diffHaplo
, which specifies differences between haplotypes.
diffHaplo(haps50, a = "XX", b = "V")
To extract the DNA sequence of the frequent haplotype V
, the indexes in the object of class haplotype
will identify, which DNA sequence contains the haplotype.
# haplotype index from its name
i <- which(attr(haps50, "dimnames")[[1]] == "V")
# index of the first sequence with the haplotype
s <- attr(haps50, "index")[[i]][1]
The respective sequence can then be identified in the alignment fa
to save or print on screen.
write.dna(fa[s], file = "hapV.fas", format = "fasta", nbcol = -1, colsep = "")
paste(unlist(as.character(fa[s])), collapse = "")