I have a table where one of my columns (mydata$Gene) contains IDs in the format:
ENSG00000000419.8
ENSG00000000460.12
I would like to understand how to use the strsplit function to remove the .xx part,
so that all my outputs come out as:
ENSG00000000419
ENSG00000000460
etc.
So far I have attempted the following code:
strsplit(mydata$Gene, ".", fixed=TRUE)
but get the error:
Error in strsplit(mydata$Gene, ".", fixed = TRUE) : non-character argument
and also
strsplit(mydata$Gene, "\.", fixed=TRUE)
Error: '\.' is an unrecognized escape in character string starting ""\."
Any suggestions?
Thank you for your time.
Your data looks like it's a factor, which is why strsplit complains about a non-character argument. Converting it to character first works:
> strsplit(as.character(mydata$Gene), ".", fixed=TRUE)
[[1]]
[1] "ENSG00000000419" "8"
[[2]]
[1] "ENSG00000000460" "12"
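If you want to stay with strsplit, the list it returns still has to be reduced to the first piece of each element. A minimal sketch, assuming a small made-up data frame standing in for your mydata (the values are just the two IDs from the question):

```r
# Hypothetical stand-in for the asker's table; Gene is stored as a factor
mydata <- data.frame(
  Gene = factor(c("ENSG00000000419.8", "ENSG00000000460.12"))
)

# strsplit() needs character input, so convert the factor first
parts <- strsplit(as.character(mydata$Gene), ".", fixed = TRUE)

# Keep only the first piece of each split (the part before the dot)
gene_ids <- vapply(parts, function(p) p[1], character(1))
gene_ids
```

vapply is used here instead of sapply only so that the result is guaranteed to be a character vector; either should do for this case.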
but you might do better with a substitution if all you want is the text before the dot:
> sub("\\..*$","",mydata$Gene)
[1] "ENSG00000000419" "ENSG00000000460"
>
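For reference, a short sketch of why the second attempt in the question failed and how the pattern used with sub avoids it. The vector ids is made up for illustration and is not the asker's real data:

```r
# Hypothetical example vector, stored as a factor like the column in the question
ids <- factor(c("ENSG00000000419.8", "ENSG00000000460.12"))

# With fixed = TRUE the pattern is matched literally, so a bare "." is enough
# there; no escaping is needed at all.

# In a regular expression the dot must be escaped, and inside an R string the
# backslash itself has to be doubled: "\\." is valid, "\." is a parse error.
# sub() coerces its input with as.character(), so the factor is accepted.
clean <- sub("\\..*$", "", ids)
clean

# Writing the result back into a data frame column would look like:
# mydata$Gene <- sub("\\..*$", "", mydata$Gene)
```

Note that the result is a plain character vector, not a factor; wrap it in factor() if you need a factor again.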