I'm dealing with text files and vectors.
I have a space separated text file with the following format:
id1 AA 44 AG 20 GG 36
id2 CC 30 CT 22 TT 48
id3 CT 60 CC 30 TT 10
...
And I need a code that loops through each line and put the id in a variable and the rest of the values in a vector. Example of a vector corresponding to the first line:
x <- id1
y <- c(AA=40,AG=20,GG=36)
Edit: I need to use HWChisq function from HardyWeinberg package to exclude SNPs that have p-value < 0.001. Function requires named vector of counts for each allele.
Loop through row by row, then apply the HWE function:
library("HardyWeinberg")
# data
df1 <- read.table(text = "
id1 AA 44 AG 20 GG 36
id2 CC 30 CT 22 TT 48
id3 CT 60 CC 30 TT 10", header = FALSE, stringsAsFactors = FALSE)
out <- apply(df1[, c(3, 5, 7)], 1, function(i){
x <- HWChisq(setNames(i, c("AA", "AB", "BB")), verbose = FALSE)
x$pval
})
# [1] 5.774374e-09 1.182236e-07 7.434226e-02
Pretty output:
cbind(df1, HWE = out)
# V1 V2 V3 V4 V5 V6 V7 HWE
# 1 id1 AA 44 AG 20 GG 36 5.774374e-09
# 2 id2 CC 30 CT 22 TT 48 1.182236e-07
# 3 id3 CT 60 CC 30 TT 10 7.434226e-02
To calculate the HWE for X-chromosome see vignette:
4. X-chromosomal tests for Hardy-Weinberg equilibrium
Recently, Graffelman and Weir (2016) have proposed specific tests for HWE for bi-allelic markers on the X-chromosome. These tests take both males and females into account. The X-chromosomal tests can be carried out by the same functions mentioned in the previous Section (HWChisq, HWLratio, HWExact, HWPerm) and adding the argument
x.linked=TRUE
to the function call.