r, split, delimiter

Split a column of values delimited by colons into separate columns for each value


I have a table of strings and numbers as below:

           V1                  V2
1  GT:AD:DP:GQ:PL  0/1:10,45:55:70:106,0,70
2  GT:AD:DP:GQ:PL  1/1:2,42:44:16:288,16,0
3  GT:AD:DP:GQ:PL  1/1:3,37:40:14:147,14,0
4  GT:AD:DP:GQ:PL  0/1:7,50:57:55:250,0,55

For column V2, I would like to split the colon-delimited (':') values into separate columns, one per value, e.g.:

   V1              V2   V3     V4  V5  V6
1  GT:AD:DP:GQ:PL  0/1  10,45  55  70  106,0,70

Solution

  • Call that table vcf, then split the second column on the colons:

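    # as.character() guards against V2 having been read in as a factor;
    # USE.NAMES=FALSE keeps the raw strings from being used as row names
    vcf.info <- data.frame(t(sapply(as.character(vcf[,2]), function(y) strsplit(y,split=":")[[1]], USE.NAMES=FALSE)))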
    

    Then cbind that with the original vcf column(s) that you want to keep:

    # drop=FALSE keeps the column as a data frame, so it keeps its name (V1)
    vcf.info2 <- cbind(vcf[,1,drop=FALSE],vcf.info)
    

    In a real VCF, though, I would keep more of the original columns:

    vcf.info2 <- cbind(vcf[,c(1,2,4,5,6,8,9)],vcf.info)
    

    Something else you may find useful: the line below pulls out just the read depth (DP) for every sample column. Replace n with the index of your last sample column (9 plus the number of samples), and replace the 3 with 1 to 5 to pick GT, AD, DP, GQ or PL.

    selectReadDepth <- apply(vcf[,10:n],2,function(x) sapply(x, function(y) strsplit(y,split=":")[[1]][3]))
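
For anyone who wants to try this on the four rows from the question, here is a minimal, self-contained sketch. It is not the exact code above: it swaps in do.call(rbind, strsplit(...)) for the sapply/t combination, and the names parts, vcf.split and read.depth are made up for illustration. It assumes every row has the same number of colon-separated fields and that V1 is identical on every row.

```r
# Toy data copied from the question; stringsAsFactors = FALSE keeps V2 as character.
vcf <- data.frame(
  V1 = rep("GT:AD:DP:GQ:PL", 4),
  V2 = c("0/1:10,45:55:70:106,0,70",
         "1/1:2,42:44:16:288,16,0",
         "1/1:3,37:40:14:147,14,0",
         "0/1:7,50:57:55:250,0,55"),
  stringsAsFactors = FALSE
)

# Split every V2 string on ":" and stack the pieces row-wise into a character matrix.
# Assumption: every row has the same number of fields.
parts <- do.call(rbind, strsplit(vcf$V2, ":", fixed = TRUE))

# Name the new columns after the keys in V1 (assumption: V1 is the same on every row).
colnames(parts) <- strsplit(vcf$V1[1], ":", fixed = TRUE)[[1]]

# Bind the original V1 column to the split fields.
vcf.split <- cbind(vcf["V1"], as.data.frame(parts, stringsAsFactors = FALSE))

# Read depth is the third field (DP); convert it from character to integer.
read.depth <- as.integer(vcf.split$DP)
```

Naming the columns GT, AD, DP, GQ and PL instead of V2 to V6 is a choice made here so that a field can be picked by name rather than by position; drop the colnames line if you prefer the default names.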