
Splitting a column of numbers separated by a white space character.


I would like to split a column in a data frame into two separate columns. The numbers in this column are separated by a white space. I have seen similar answers here and tried to follow the same approach, but I have not been successful, and I don't know what I am doing wrong. My original data looks like this:

[Image: a .snp file]

I would like to split col5 into separate columns for the reference and variant alleles. My latest attempt is as follows:

    df <- cSplit(snpfile, snpfile$col5, " ")
    names(df) <- paste0(sub("(.*_).*", "\\ ", names(df)), c("REF", "VAR"))

I would really appreciate any help understanding how to do this, as it seems like it should be easy enough. Thank you all.


Solution

  • I found a solution after studying the help file for the `cSplit` function more carefully. The column to split has to be given by name, as a character string, rather than as the column's values. I only needed to change

    df <- cSplit(snpfile, snpfile$col5, " ") to df <- cSplit(snpfile, "col5", " ")

    I also dropped the `sub` call for naming the new columns and simply renamed them the basic way, with the `colnames` function.
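For anyone landing here later, the fix above can be sketched roughly as follows. The toy `snpfile` data frame and its values are made up for illustration (the real .snp file is not shown here), and the sketch assumes `cSplit` from the splitstackshape package names the new columns `col5_1` and `col5_2`, which is its usual `<column>_<n>` pattern.

```r
library(splitstackshape)

# Hypothetical stand-in for the real data: col5 holds two values
# separated by a single space.
snpfile <- data.frame(
  col1 = c("snp1", "snp2", "snp3"),
  col5 = c("12 7", "3 45", "8 19"),
  stringsAsFactors = FALSE
)

# Pass the column by name ("col5"), not as snpfile$col5.
df <- cSplit(snpfile, "col5", sep = " ")

# Rename the two new columns the basic way, with colnames().
# Assumes the split produced columns called col5_1 and col5_2.
colnames(df)[colnames(df) == "col5_1"] <- "REF"
colnames(df)[colnames(df) == "col5_2"] <- "VAR"

print(df)
```

Note that `cSplit` returns a data.table and, by default, drops the original `col5` once it has been split, so `df` ends up with `col1`, `REF`, and `VAR`.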