How do I use STDIN if no positional arguments are given with clap?


I have a clap App like this:

let m = App::new("test")
    .arg(
        Arg::with_name("INPUT")
            .help("a string to be frobbed")
            .multiple(true),
    )
    .get_matches();

If any arguments are given (myapp str1 str2 str3), I want to read them as an iterable of strings. If none are given, the program should act as a filter and read an iterable of lines from stdin (cat afile | myapp). This is my attempt:

let stdin = io::stdin();
let strings: Box<dyn Iterator<Item = String>> = if m.is_present("INPUT") {
    Box::new(m.values_of("INPUT").unwrap().map(|ln| ln.to_string()))
} else {
    Box::new(stdin.lock().lines().map(|ln| ln.unwrap()))
};

for string in strings {
    frob(string)
}

I believe that, since all I require is the Iterator trait, a Box<dyn Iterator<Item = String>> is the only way to go. Is that correct?


Solution

  • There is rarely an "only way to go", and this case is no different. One alternative approach would be to use static dispatch instead of dynamic dispatch.

    Your main processing code needs an iterator of strings as input. So you could define a processing function like this:

    fn process<I: IntoIterator<Item = String>>(strings: I) {
        for string in strings {
            frob(string);
        }
    }
    

    The invocation of this code could look like this:

    match m.values_of("INPUT") {
        Some(values) => process(values.map(|ln| ln.to_string())),
        None => process(io::stdin().lock().lines().map(|ln| ln.unwrap())),
    }
    

    The compiler will emit two different versions of process(), one for each iterator type (this is known as monomorphization). Each version statically calls the iterator methods it was compiled for, so the only dispatch is the single choice between the two calls in the match expression.

    (I probably got some details wrong here, but you get the idea.)
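    To try the static-dispatch idea without clap or a real stdin, here is a minimal sketch that uses only the standard library. Everything outside process() is an assumption made for illustration: frob is a hypothetical stand-in that upper-cases its input, run is a made-up helper, its args parameter plays the role of m.values_of("INPUT"), and input is any BufRead standing in for stdin.

```rust
use std::io::BufRead;

// Hypothetical stand-in for the question's `frob`; here it just upper-cases.
fn frob(s: String) -> String {
    s.to_uppercase()
}

// Generic over the iterator type, so the compiler generates one copy of this
// function for each concrete iterator type it is called with.
fn process<I: IntoIterator<Item = String>>(strings: I) -> Vec<String> {
    strings.into_iter().map(frob).collect()
}

// Made-up helper: `args` mimics the optional positional values,
// `input` mimics stdin.
fn run<R: BufRead>(args: Option<Vec<&str>>, input: R) -> Vec<String> {
    match args {
        // Iterator type: Map over a Vec iterator.
        Some(values) => process(values.into_iter().map(|s| s.to_string())),
        // Iterator type: Map over the lines of a reader.
        None => process(input.lines().map(|ln| ln.expect("failed to read line"))),
    }
}

fn main() {
    let from_args = run(Some(vec!["str1", "str2"]), "".as_bytes());
    println!("{:?}", from_args);

    let from_input = run(None, "a\nb\n".as_bytes());
    println!("{:?}", from_input);
}
```

    Both arms call process(), but with two unrelated iterator types, so no Box is needed. Returning a Vec here is only to make the result easy to inspect.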

    Your version, on the other hand, uses the type Box<dyn Iterator<Item = String>>, so the iterator will be allocated on the heap, and there will be a dynamic dispatch each time next() is called on it. For a program like this, that is probably fine.

    There are certainly more ways of structuring the code and dispatching between the two different kinds of input, e.g. using the Either type from the either crate, or simply writing two different for loops for the two cases. Which one to choose depends on tradeoffs with other requirements of your code, your performance requirements and your personal preferences.
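    The enum-based option can be sketched with the standard library alone. The Source enum below is a hand-written stand-in for the idea behind Either, not the either crate's actual type, and strings_from is a hypothetical helper. Its args parameter again plays the role of the clap values and input the role of stdin.

```rust
use std::io::BufRead;

// Hand-written stand-in for an Either-like type: a single concrete type that
// wraps one of two different iterator types.
enum Source<A, B> {
    Args(A),
    Lines(B),
}

impl<A, B> Iterator for Source<A, B>
where
    A: Iterator<Item = String>,
    B: Iterator<Item = String>,
{
    type Item = String;

    fn next(&mut self) -> Option<String> {
        // A plain branch on the variant; no heap allocation, no vtable.
        match self {
            Source::Args(it) => it.next(),
            Source::Lines(it) => it.next(),
        }
    }
}

// Hypothetical helper: picks the input source once and returns one iterator type.
fn strings_from<R: BufRead>(
    args: Option<Vec<String>>,
    input: R,
) -> impl Iterator<Item = String> {
    match args {
        Some(values) => Source::Args(values.into_iter()),
        None => Source::Lines(input.lines().map(|ln| ln.expect("failed to read line"))),
    }
}

fn main() {
    let args = Some(vec!["str1".to_string(), "str2".to_string()]);
    for s in strings_from(args, "".as_bytes()) {
        println!("{}", s);
    }
    for s in strings_from(None, "a\nb\n".as_bytes()) {
        println!("{}", s);
    }
}
```

    Compared with the boxed version, this keeps a single for loop at the call site while avoiding the heap allocation. The cost is a branch on every next() call and a little extra boilerplate.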