
How to extract first/last alphanumeric character from String


I have a seemingly simple problem: extracting the first and last alphanumeric characters from a String in Rust. Please consider my minimal example below:

fn main() {
    let s: String = "Peace to the world!".to_string();

    // check: string is not empty
    if !s.is_empty() {

        let s    = s.to_uppercase();

        let fidx = s.find(char::is_alphanumeric).unwrap();
        println!("fidx: {fidx}");

        let fchar = s.get(fidx..fidx).unwrap();
        println!("fchar: {fchar}");

        let lidx = s.rfind(char::is_alphanumeric).unwrap();
        println!("lidx: {lidx}");

        let lchar = s.get(lidx..lidx).unwrap();
        println!("lchar: {lchar}");
    }
}

My code returns empty strings for both fchar and lchar. How do I obtain these characters from my string? Would you also please check my code? I'm not certain this is the shortest and most elegant solution.


Solution

  • fidx..fidx is always empty because the upper bound of a range written with .. is exclusive.

    You probably meant to use an inclusive range:

    let fchar = &s[fidx..=fidx];
    

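    But technically, that doesn't give you the first character: it gives you a &str containing the first character, and it only works if that character is exactly 1 byte wide in its UTF-8 representation. For a wider character the slice would end in the middle of the character, and the indexing panics.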
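To illustrate the byte-width caveat, here is a minimal sketch. The helper name first_alnum_slice is hypothetical (it is not part of the question or of the standard library); it widens the range by the character's actual UTF-8 length instead of assuming one byte. The example uses get rather than indexing so that a bad range yields None instead of a panic.

```rust
/// Hypothetical helper: returns the first alphanumeric character as a `&str`,
/// sized by the character's real UTF-8 width rather than a fixed one byte.
fn first_alnum_slice(s: &str) -> Option<&str> {
    // Byte index of the first alphanumeric character, if any.
    let idx = s.find(char::is_alphanumeric)?;
    // Number of bytes the character at `idx` occupies in UTF-8.
    let width = s[idx..].chars().next()?.len_utf8();
    s.get(idx..idx + width)
}

fn main() {
    let s = "été!";
    let idx = s.find(char::is_alphanumeric).unwrap();

    // Exclusive upper bound: the range contains no bytes at all.
    assert_eq!(s.get(idx..idx), Some(""));

    // Inclusive one-byte range: 'é' is two bytes wide, so the range ends
    // inside the character. `get` reports None; `&s[idx..=idx]` would panic.
    assert_eq!(s.get(idx..=idx), None);

    // Using the real width yields the whole character.
    assert_eq!(first_alnum_slice(s), Some("é"));
}
```

If a char is what you actually need, the iterator-based versions in the rest of this answer avoid the byte arithmetic entirely.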

    So instead, if you don't need the index, you can get the first char that fulfills a criterion directly with this:

    let fchar = s.chars().find(|c| c.is_alphanumeric()).unwrap();
    

    Or both at once with this:

    let (fidx, fchar) = s.char_indices().find(|(_, c)| c.is_alphanumeric()).unwrap();
    

    Note: .char_indices() gives you an iterator over the byte index and corresponding character, which is what your code calculates. If you're after the character index instead, you can replace it with .chars().enumerate().
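The difference between the two indices only shows up with multi-byte characters, so here is a small sketch with a string that starts with one. The three function names are made up for illustration. One caveat: .chars().enumerate() is not a double-ended iterator, so rfind is not available on it; the last helper below works around that by counting the characters in front of the byte index.

```rust
/// Byte index and value of the first alphanumeric character.
fn first_alnum_byte_index(s: &str) -> Option<(usize, char)> {
    s.char_indices().find(|(_, c)| c.is_alphanumeric())
}

/// Character index and value of the first alphanumeric character.
fn first_alnum_char_index(s: &str) -> Option<(usize, char)> {
    s.chars().enumerate().find(|(_, c)| c.is_alphanumeric())
}

/// Character index and value of the last alphanumeric character.
/// Found by byte index first, then converted by counting preceding chars.
fn last_alnum_char_index(s: &str) -> Option<(usize, char)> {
    let (byte_idx, c) = s.char_indices().rfind(|(_, c)| c.is_alphanumeric())?;
    Some((s[..byte_idx].chars().count(), c))
}

fn main() {
    // '¡' is not alphanumeric and takes two bytes in UTF-8.
    let s = "¡Olé!";
    println!("{:?}", first_alnum_byte_index(s)); // Some((2, 'O'))
    println!("{:?}", first_alnum_char_index(s)); // Some((1, 'O'))
    println!("{:?}", last_alnum_char_index(s)); // Some((3, 'é'))
}
```

Only the byte index can be used to slice the string; the character index is mainly useful for reporting positions to a human.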

    For the last character, just replace find with rfind in either version:

    let lchar = s.chars().rfind(|c| c.is_alphanumeric()).unwrap();
    let (lidx, lchar) = s.char_indices().rfind(|(_, c)| c.is_alphanumeric()).unwrap();
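Putting the pieces together, the original program could be reduced to something like the sketch below. The function name first_last_alphanumeric is an assumption for illustration. It returns an Option instead of calling unwrap, so a string with no alphanumeric characters (including the empty string) is handled without a panic, and the separate is_empty check is no longer needed.

```rust
/// Returns the first and last alphanumeric characters of `s`,
/// or None if `s` contains no alphanumeric character at all.
fn first_last_alphanumeric(s: &str) -> Option<(char, char)> {
    let first = s.chars().find(|c| c.is_alphanumeric())?;
    // `rfind` walks the characters from the end of the string.
    let last = s.chars().rfind(|c| c.is_alphanumeric())?;
    Some((first, last))
}

fn main() {
    let s = "Peace to the world!".to_uppercase();

    match first_last_alphanumeric(&s) {
        Some((fchar, lchar)) => println!("fchar: {fchar}, lchar: {lchar}"),
        None => println!("no alphanumeric characters found"),
    }
}
```

For the uppercased input this prints "fchar: P, lchar: D". If the string holds exactly one alphanumeric character, both values are that same character.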