Split string once on the first whitespace in Rust


I have a string, say "dog cat fish", that I want to split on the first whitespace into two slices that look like this: ("dog", "cat fish").
I naively tried using the split_once() method like this:

let string = "dog cat fish";
let (first_word, rest_of_string) = string.split_once(' ').unwrap();

This works for the ordinary space character. However, I would also like it to handle other Unicode whitespace characters, such as \t, the way the split_whitespace() method does.
I don't want to use split_whitespace(), though, because it returns an iterator, and I would have to collect and re-join the words after iterating, which would be wasteful:

let mut it = string.split_whitespace();
let first_word = it.next().unwrap();
let rest_of_string = it.collect::<Vec<&str>>().join(" ");

So, in case I had a string like "dog \t cat fish", how could I split it to obtain these two slices ("dog", "cat fish")?
I also thought of using regular expressions, but is there a better method?


Solution

  • You could call split_once with char::is_whitespace as the pattern. That splits only at the first whitespace character, so any whitespace that follows it stays at the start of the second &str. You'll then need to trim that second &str from the start with trim_start().

    fn main() {
        let string = "dog \t cat fish";
        let (a, b) = string.split_once(char::is_whitespace).unwrap();
        let b = b.trim_start();
        dbg!(a, b);
    }
    

    Output:

    [src/main.rs:5] a = "dog"
    [src/main.rs:5] b = "cat fish"
    

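    The unwrap() above panics when the string contains no whitespace at all. A minimal sketch of the same approach that returns an Option instead, assuming that is the behavior you want; the helper name split_first_whitespace is hypothetical and not part of the standard library:

```rust
/// Splits `s` at the first run of Unicode whitespace.
/// Returns `None` when `s` contains no whitespace at all.
/// NOTE: `split_first_whitespace` is a hypothetical helper name used for
/// illustration; it is not a standard library function.
fn split_first_whitespace(s: &str) -> Option<(&str, &str)> {
    // `split_once` cuts at the first character matching `char::is_whitespace`;
    // `?` propagates `None` when there is no such character.
    let (first, rest) = s.split_once(char::is_whitespace)?;
    // Drop any further whitespace left at the start of the remainder.
    Some((first, rest.trim_start()))
}
```

    Both returned slices borrow from the input, so nothing is allocated. Note that if the input starts with whitespace, the first slice is empty; call trim_start() on the input beforehand if that matters for your data.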