How to implement trim for Vec<u8>?


Rust provides a trim method for strings: str::trim() removes leading and trailing whitespace. I want a method that does the same for byte strings. It should take a Vec<u8> and remove leading and trailing whitespace (space, 0x20, and horizontal tab, 0x09).

Writing a trim_left() is easy: you can just use an iterator with skip_while():

fn main() {
    let a: &[u8] = b"     fo o ";
    // Copy each byte out of the slice, then skip leading spaces (0x20) and tabs (0x09).
    let b: Vec<u8> = a.iter().copied().skip_while(|&x| x == 0x20 || x == 0x09).collect();
    println!("{:?}", b);
}

But to trim on the right, I would need to look ahead and check that no other non-whitespace byte follows once whitespace is found.


Solution

  • Here's an implementation that returns a slice, rather than a new Vec<u8>, as str::trim() does. It's also implemented on [u8], since that's more general than Vec<u8> (you can obtain a slice from a vector cheaply, but creating a vector from a slice is more costly, since it involves a heap allocation and a copy).

    trait SliceExt {
        fn trim(&self) -> &Self;
    }
    
    impl SliceExt for [u8] {
        fn trim(&self) -> &[u8] {
            fn is_whitespace(c: &u8) -> bool {
                *c == b'\t' || *c == b' '
            }
    
            fn is_not_whitespace(c: &u8) -> bool {
                !is_whitespace(c)
            }
    
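            // Index of the first non-whitespace byte; None means the slice is empty or all whitespace.
            if let Some(first) = self.iter().position(is_not_whitespace) {
                // A non-whitespace byte exists, so searching from the right always finds one too.
                let last = self.iter().rposition(is_not_whitespace).unwrap();
                &self[first..=last]
            } else {
                &[]
            }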
        }
    }
    
    fn main() {
        let a = b"     fo o ";
        let b = a.trim();
        println!("{:?}", b);
    }
    

    If you really need a Vec<u8> after the trim(), you can just call into() on the slice to turn it into a Vec<u8>.

    fn main() {
        let a = b"     fo o ";
        let b: Vec<u8> = a.trim().into();
        println!("{:?}", b);
    }
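
The question asks for a function that takes a Vec<u8>. If you own the vector and would rather trim it in place than copy the trimmed slice into a new one, the same position()/rposition() idea can be combined with Vec::truncate() and Vec::drain(). The following is only a minimal sketch; the helper name trim_in_place is hypothetical and not part of the answer above.

```rust
/// Hypothetical helper (an illustration, not part of the original answer):
/// removes leading and trailing spaces and tabs from an owned buffer
/// without allocating a new vector.
fn trim_in_place(v: &mut Vec<u8>) {
    fn is_not_whitespace(c: &u8) -> bool {
        *c != b' ' && *c != b'\t'
    }

    // Cut off the trailing whitespace first, so the drain below moves fewer bytes.
    // If every byte is whitespace, `end` is 0 and the vector becomes empty.
    let end = v.iter().rposition(is_not_whitespace).map_or(0, |i| i + 1);
    v.truncate(end);

    // Remove the leading whitespace; the remaining bytes are shifted to the front.
    let start = v.iter().position(is_not_whitespace).unwrap_or(0);
    v.drain(..start);
}

fn main() {
    let mut v = b"  \t fo o \t".to_vec();
    trim_in_place(&mut v);
    println!("{:?}", v); // prints [102, 111, 32, 111], i.e. b"fo o"
}
```

This keeps the vector's existing allocation, at the cost of shifting the kept bytes once. When a borrowed view is enough, the slice-returning trim() above is still cheaper, since it moves nothing.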