I want to clean up a taxonomy table of bacterial species in R
by removing every value that starts with a lowercase letter.
I have this column in my taxonomy data frame:
Species |
Tuwongella immobilis |
Woesebacteria |
unidentified marine |
bacterium Ellin506 |
And I want:
Species |
Tuwongella immobilis |
Woesebacteria |
I tried:
library(stringr)
unwanted <- "^[:upper:]+[:lower:]+"
tax.clean$Species <- str_replace_all(tax.clean$Species, unwanted, "")
but it doesn't work: the pattern does not match the species I want to remove.
If you are working with a data frame, I suggest using dplyr::filter
to clean it up.
grepl returns logical values, so !grepl("^[[:lower:]]", Species)
keeps every value that does not start with a lowercase letter (^
marks the beginning of a string).
library(dplyr)
df %>% filter(!grepl("^[[:lower:]]", Species))
1 Tuwongella immobilis
2 Woesebacteria
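For anyone who wants to try this without dplyr, here is a minimal base R sketch of the same idea. The data frame is rebuilt by hand from the Species column in the question, and the names starts_lower, kept and blanked are made up for illustration. It also shows a second option, which is an assumption about what "delete values from cells" might mean: keep all rows and set the unwanted cells to NA.

```r
# Hypothetical data frame rebuilt from the Species column in the question.
tax.clean <- data.frame(
  Species = c("Tuwongella immobilis", "Woesebacteria",
              "unidentified marine", "bacterium Ellin506"),
  stringsAsFactors = FALSE
)

# TRUE for values whose first character is a lowercase letter.
starts_lower <- grepl("^[[:lower:]]", tax.clean$Species)

# Option 1: drop those rows (base R counterpart of the filter() call).
kept <- tax.clean[!starts_lower, , drop = FALSE]

# Option 2: keep every row but replace the unwanted cells with NA.
blanked <- tax.clean
blanked$Species[starts_lower] <- NA_character_

print(kept)
print(blanked)
```

Option 1 should give the same two rows as the filter() call above; option 2 may be the better fit if the other taxonomy columns in those rows still need to be kept.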