I want to parse a string like "ParseThis" or "parseThis" into a vector of strings like ["Parse", "This"] or ["parse", "this"] using the nom crate.
None of my attempts return the expected result. It's possible that I don't yet understand how to use all the functions in nom.
I tried:
named!(camel_case<(&str)>,
    map_res!(
        take_till!(is_not_uppercase),
        std::str::from_utf8));
named!(p_camel_case<&[u8], Vec<&str>>,
    many0!(camel_case));
But p_camel_case just returns an Error(Many0) for a string that starts with an uppercase letter. For a string that starts with a lowercase letter it returns Done, but with an empty string as the result.
How can I tell nom that I want to split the string at uppercase letters (given that the first letter may be either uppercase or lowercase)?
You are looking for things that start with any character, followed by a number of non-uppercase letters. As a regex, that would look akin to .[a-z]*. Translated directly to nom, that's something like:
#[macro_use]
extern crate nom;

use nom::anychar;

// Treats each byte as a char, so this is only reliable for ASCII input.
fn is_uppercase(a: u8) -> bool {
    (a as char).is_uppercase()
}

// One character of any case, then everything up to the next uppercase letter.
named!(char_and_more_char<()>, do_parse!(
    anychar >>
    take_till!(is_uppercase) >>
    ()
));

// Return the bytes matched above as a &str.
named!(camel_case<(&str)>, map_res!(recognize!(char_and_more_char), std::str::from_utf8));

named!(p_camel_case<&[u8], Vec<&str>>, many0!(camel_case));

fn main() {
    println!("{:?}", p_camel_case(b"helloWorld"));
    // Done([], ["hello", "World"])
    println!("{:?}", p_camel_case(b"HelloWorld"));
    // Done([], ["Hello", "World"])
}
Of course, you probably need to be careful about correctly matching non-ASCII input, since is_uppercase here looks at individual bytes, but you should be able to extend this in a straightforward manner.
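To make the non-ASCII caveat concrete, here is a minimal sketch of the same rule (any one character, followed by everything up to the next uppercase character) written against the standard library only, without nom. It is not a nom parser and not a drop-in replacement for the code above. The function name split_camel_case is hypothetical and chosen just for this illustration. It walks the input by char rather than by byte, so multi-byte UTF-8 characters are classified and sliced correctly.

```rust
/// Splits `input` into camel-case words: each word is one character of any
/// case followed by every subsequent character that is not uppercase.
/// (Hypothetical helper for illustration; not part of nom.)
fn split_camel_case(input: &str) -> Vec<&str> {
    let mut words = Vec::new();
    let mut start = 0;
    // char_indices yields byte offsets that always fall on char boundaries,
    // so slicing at them is safe even for multi-byte UTF-8 characters.
    for (idx, ch) in input.char_indices() {
        // An uppercase character starts a new word, unless it is the very
        // first character of the input.
        if ch.is_uppercase() && idx != start {
            words.push(&input[start..idx]);
            start = idx;
        }
    }
    // Push the trailing word, if any input remains.
    if start < input.len() {
        words.push(&input[start..]);
    }
    words
}

fn main() {
    println!("{:?}", split_camel_case("helloWorld"));
    // ["hello", "World"]
    println!("{:?}", split_camel_case("HelloWorld"));
    // ["Hello", "World"]
    println!("{:?}", split_camel_case("élanÉcole"));
}
```

As with the nom version, a run of consecutive uppercase letters is split into one-letter words, because every uppercase character starts a new word. If you want to stay within nom, the same idea applies: classify whole characters rather than individual bytes.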