I have a string, say "dog cat fish", that I want to split on the first whitespace into two slices that look like this: ("dog", "cat fish").
I naively tried to use the split_once() method like this:
let string = "dog cat fish";
let (first_word, rest_of_string) = string.split_once(' ').unwrap();
This works for an ordinary space character. However, I would also like it to work for other Unicode whitespace characters, such as \t, the way the split_whitespace() method does.
I don't want to use split_whitespace(), though, because it returns an iterator, and I would have to collect the remaining words and join them again after iterating, which would be wasteful:
let mut it = string.split_whitespace();
let first_word = it.next().unwrap();
let rest_of_string = it.collect::<Vec<&str>>().join(" ");
So, if I had a string like "dog \t cat fish", how could I split it to obtain these two slices: ("dog", "cat fish")?
I also thought of using regular expressions, but is there a better method?
You could call split_once with char::is_whitespace as the pattern, but that splits only at the first whitespace character, so any whitespace that follows it stays at the front of the second slice. You'll then need to call trim_start() on the second &str.
fn main() {
    let string = "dog \t cat fish";
    let (a, b) = string.split_once(char::is_whitespace).unwrap();
    let b = b.trim_start();
    dbg!(a, b);
}
Output:
[src/main.rs:5] a = "dog"
[src/main.rs:5] b = "cat fish"
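If the input might contain no whitespace at all, the unwrap() above would panic. As a minimal sketch, the same idea can be wrapped in a small helper that returns an Option instead. The name split_first_whitespace is hypothetical and not part of the standard library, and trimming leading whitespace before splitting is an assumption about the desired behavior rather than something the question asks for.

```rust
/// Hypothetical helper (not a std API): splits `s` at its first run of
/// Unicode whitespace and returns both halves as borrowed slices.
/// Returns `None` when there is no whitespace after the first word.
fn split_first_whitespace(s: &str) -> Option<(&str, &str)> {
    // Assumption: leading whitespace is ignored so the first word is never empty.
    let s = s.trim_start();
    // `?` propagates `None` instead of panicking when nothing matches.
    let (first, rest) = s.split_once(char::is_whitespace)?;
    // Drop the remainder of the whitespace run in front of the second slice.
    Some((first, rest.trim_start()))
}
```

Calling split_first_whitespace("dog \t cat fish") yields Some(("dog", "cat fish")), and split_first_whitespace("dog") yields None. Both returned slices borrow from the original string, so nothing is allocated.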