Edit: clarification
I'm looking to rename the tip labels (tree$tip.label) of my nwk tree in R. An example of the current labels is:
AB177299.1 Uncultured bacterium gene for 16S rRNA, clone: ODP1251B1.3
but I'd just like the ID at the start, so AB177299.1. My first inclination is to use gsub() on tree$tip.label, but my issue is that every ID and the text following it are different for each tip. I then created a character vector of all the IDs, which could be used as the tip names if I were able to accurately replace each label by using order().
tip.ids <- rownames(taxonomy)
tree$tip.label[order(tree$tip.label) %in%
tip.ids] <- tip.ids
This didn't change anything within my tree, and I'm at a loss.
For reproducibility (taxonomy isn't included because you should only need tip.ids):
> dput(tip.ids)
c("AB109878.1", "AB109879.1", "AB109880.1", "AB109881.1", "AB109882.1",
"AB109883.1", "AB109884.1", "AB109885.1")
> dput(tree)
structure(list(edge = structure(c(9L, 10L, 10L, 9L, 11L, 11L,
12L, 13L, 13L, 12L, 14L, 14L, 15L, 15L, 10L, 1L, 2L, 11L, 3L,
12L, 13L, 4L, 5L, 14L, 6L, 15L, 7L, 8L), dim = c(14L, 2L)), edge.length = c(0.0341921975,
5e-09, 0.12821348, 0.000367458500000008, 0.027617765, 0.037677039,
0.028633124, 0.014468092, 5e-09, 0.009763081, 0.078168769, 0.021640684,
0.341568464, 0.092957415), Nnode = 7L, node.label = c("root",
"0.917", "", "0.929", "0.921", "0.302", "0.692"), tip.label = c("'AB109881.1 Uncultured archaeon gene for 16S rRNA, partial sequence, clone:pMLA-4'",
"'AB109880.1 Uncultured archaeon gene for 16S rRNA, partial sequence, clone:pMLA-3'",
"'AB109883.1 Uncultured archaeon gene for 16S rRNA, partial sequence, clone:pMLA-6'",
"'AB109879.1 Uncultured archaeon gene for 16S rRNA, partial sequence, clone:pMLA-2'",
"'AB109884.1 Uncultured archaeon gene for 16S rRNA, partial sequence, clone:pMLA-7'",
"'AB109878.1 Uncultured archaeon gene for 16S rRNA, partial sequence, clone:pMLA-1'",
"'AB109882.1 Uncultured archaeon gene for 16S rRNA, partial sequence, clone:pMLA-5'",
"'AB109885.1 Uncultured archaeon gene for 16S rRNA, partial sequence, clone:pMLA-8'"
)), class = "phylo", order = "cladewise")
Maybe one of these options will work for you.
library(tidyverse, quietly = TRUE)
text <- "'AB109880.1 Uncultured archaeon gene for 16S rRNA, partial sequence, clone:pMLA-3'"
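# Option 1: parse the number and re-attach the prefix (assumes every ID starts with "AB")
paste0("AB", parse_number(text))
# Option 2: take everything before the first space, then drop the leading quote
str_remove(str_split(text, " ")[[1]][1], "^'")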
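Both options work on a single string. For the whole tree, a vectorised base R substitution may be simpler, since it keeps the labels in their existing tip order and needs no matching against tip.ids. The sketch below is only an illustration: labels is a hypothetical stand-in for tree$tip.label (three labels copied from the dput output in the question), and it assumes every label starts with an optional single quote followed by the ID and then a space. It also shows why the original assignment did nothing: order() returns integer positions, which never match the character IDs.

```r
# Hypothetical stand-in for tree$tip.label (labels copied from the question)
labels <- c(
  "'AB109881.1 Uncultured archaeon gene for 16S rRNA, partial sequence, clone:pMLA-4'",
  "'AB109880.1 Uncultured archaeon gene for 16S rRNA, partial sequence, clone:pMLA-3'",
  "'AB109883.1 Uncultured archaeon gene for 16S rRNA, partial sequence, clone:pMLA-6'"
)
tip.ids <- c("AB109880.1", "AB109881.1", "AB109883.1")

# Why the original attempt was a no-op: order() gives integer positions,
# so %in% compares positions with IDs and the logical index is all FALSE.
mask <- order(labels) %in% tip.ids
any(mask)
# [1] FALSE

# Drop an optional leading quote, keep everything up to the first space,
# and discard the rest of the label. Tip order is unchanged.
new_labels <- sub("^'?([^ ]+).*$", "\\1", labels)
new_labels
# [1] "AB109881.1" "AB109880.1" "AB109883.1"
```

On the real tree, the same pattern would presumably be applied as tree$tip.label <- sub("^'?([^ ]+).*$", "\\1", tree$tip.label). Afterwards, all(tree$tip.label %in% tip.ids) is a quick check that every new label is a known ID.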