What is the correct way in Rust to take a character such as ἄ and return the plain α without accents? (For example, do a Unicode NFC to NFD conversion so that the α is separated from the ᾽ and ´.)
I suspect this Rust documentation page describes the function I need, but it has no example code showing how to use it.
https://docs.rs/unicode-normalization/latest/unicode_normalization/char/fn.decompose_canonical.html
I know this rather hacky, convoluted code works, but it doesn't seem like the right approach:
let s: Vec<char> = c.to_string().nfkd().collect();
s[0] // <--- unaccented
The function you pass in is called once for each character in the decomposition. The first character it's called with is the base character you are interested in. Example code:
use unicode_normalization::char::decompose_canonical;

fn main() {
    let mut base_char = None;
    // The closure runs once per decomposed character; keep only the first.
    decompose_canonical('ἄ', |c| { base_char.get_or_insert(c); });
    dbg!(base_char);
}