Search code examples
rplotbioinformaticspca

Colorful MDS plot in edgeR


I have problem to create MDS plot in edgeR to visualize experimental (leukemia) and control (healthy donors) groups in color.

I used htseq files as an input for edgeR. Each file consists of two columns - gene_ID and read counts. "A" stands for leukemia patients, "H" stands for healthy donors.

Here is my code:

Create a table:

samples <- matrix(c("A18.txt","experiment","blood_exp",
        "A19.txt","experiment","blood_exp",
        "A20.txt","experiment","blood_exp",
        "A23.txt","experiment","blood_exp",
        "A24.txt","experiment","blood_exp",
        "A26.txt","experiment","blood_exp",
        "A30.txt","experiment","blood_exp",
        "A37.txt","experiment","blood_exp",
        "H11.txt","control","blood_control",
        "H12.txt","control","blood_control",
        "H13.txt","control","blood_control",
        "H15.txt","control","blood_control",
        "H16.txt","control","blood_control",
        "H17.txt","control","blood_control",
        "H18.txt","control","blood_control",
        "H19.txt","control","blood_control"),
    nrow = 16, ncol = 3, byrow = TRUE, dimnames = list(c(1:16), c("library_name","condition","group_ALL_vs_control")))

samples <- as.data.frame (samples, row.names = NULL, optional = FALSE, stringAsFactors = default.stringAsFactors())

Use the edgeR function, readDGE, to read in the READS COUNT files created frou htseq-count:

counts <- readDGE(samples$library_name, path = 'C:/Users/okbm4/Desktop/htseq_files', columns=c(1,2), group = samples$group_ALL_vs_control, header = FALSE)

colnames(counts) <- samples$library_name

Filter weakly expressed and noninformative (i.e.amibigous) features:

noint <- rownames(counts) %in% c('__no_feature','__ambiguous','__too_low_aQual','__not_aligned','__alignment_not_unique')

cpms <- cpm(counts)
keep <- rowSums (cpms > 1) >= 4 & !noint
counts <- counts[keep,]

Create a DGElist object

counts <- DGEList(counts=counts,group = samples$group_ALL_vs_control)

Estimate normalization factor, this is the normalization for the library size

counts <- calcNormFactors(counts)

Inspect relationships between samples using MDS plot.

pdf(file = 'HCB_ALL.pdf', width = 9, height = 6)

plotMDS(counts, labels = c('A18.txt','A19.txt','A20.txt','A23.txt','A24.txt','A26.txt','A30.txt','A37.txt','H11.txt','H12.txt','H13.txt','H15.txt','H16.txt','H17.txt','H18.txt','H19.txt'), 

xlab = 'Dimension 1',
ylab = 'Dimension 2', 
asp = 6/9,
cex = 0.8,

main = 'Multidimentional scaling plot')
par(cex.axis =0.6, cex.lab = 0.6, cex.main = 1)

I attached file that I generate before. enter image description here

I would be glad to hear any suggestions.


Solution

  • plotMDS() produces an object that can be passed to plot() just as it is, so that you can choose your own plotting symbols and x and y axis labels:

     mds <- plotMDS(yourdata)
       plot(mds)
    

    You can add any arguments to plot() to choose plotting symbols, colors etc.