Tags: rust, iterator, unsafe, standard-library

Are there any Rust functions for wrapping an iterator that depends on a reference, so that the wrapper contains the referent?


In this case I want to read integers from standard input, where they are separated by spaces and newlines. My first attempt was similar to the following code:

fn splitter(x: String) -> impl Iterator<Item=&'static str> {
    x.as_str().split_whitespace()
}

fn valuereader<A: std::str::FromStr>() -> impl Iterator<Item=A> 
where <A as std::str::FromStr>::Err: std::fmt::Debug
{
    let a = std::io::stdin().lines();
    let b = a.map(Result::unwrap);
    let c = b.flat_map(splitter);
    c.map(|x|x.parse().expect("Not an integer!"))
}

fn main() {
    let temp: Vec<usize> = valuereader().collect();
    println!("{:?}", temp);
}

The problem is that split_whitespace borrows from a &str, but std::io::stdin().lines() yields owned Strings, so the iterator returned by splitter would outlive the String it borrows from. I don't want to use x.as_str().split_whitespace().collect(), because I don't want to allocate a temporary vector.
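For reference, here is a minimal sketch of the allocating version I am trying to avoid. The name splitter_allocating is made up for illustration. Note that a plain collect() of &str items would still borrow from the String, so in this sketch each token is copied into its own String before the line is dropped:

```rust
// Allocating baseline (hypothetical name): copy each token into an owned
// String and collect into a Vec, so nothing borrows from `line` once this
// function returns.
fn splitter_allocating(line: String) -> impl Iterator<Item = String> {
    line.split_whitespace()
        .map(str::to_owned)
        .collect::<Vec<String>>()
        .into_iter()
}

fn main() {
    let words: Vec<String> = splitter_allocating(String::from("1 2\t3")).collect();
    println!("{:?}", words);
}
```

This compiles without unsafe, but it costs one Vec per line plus one String per token, which is exactly the overhead the rest of the question tries to get rid of.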

The best solution I could come up with was a wrapper, written with unsafe code, that contains both the owned String and the iterator that depends on it. The wrapper's implementation of Iterator simply forwards to the iterator that depends on the String. This was the result:

mod move_wrapper {
    use std::pin::Pin;
    pub fn to_wrapper<'b, A: 'b, F, B: 'b> (a: A, f: F) -> Wrapper<A,B>
    where
        F: FnOnce (&'b A) -> B
    {
        let contained_a = Box::pin(a);
        // Here is the use of unsafe. It is necessary to create a reference to a that can live as long as needed.
        // This should not be dangerous as no-one outside this module will be able to copy this reference, and a will live exactly as long as b inside Wrapper.
        let b = f(unsafe{&*core::ptr::addr_of!(*contained_a)});
        Wrapper::<A,B> {_do_not_use:contained_a, dependent:b}
    }

    pub struct Wrapper<A,B> {
        _do_not_use: Pin<Box<A>>,
        dependent: B
    }

    impl<A,B: Iterator> Iterator for Wrapper<A,B>
    {
        type Item = B::Item;
        fn next(&mut self) -> Option<Self::Item> {
            self.dependent.next()
        }
    }
}

fn splitter(x: String) -> impl Iterator<Item=&'static str> {
    move_wrapper::to_wrapper(x, |a|a.as_str().split_whitespace())
}

fn valuereader<A: std::str::FromStr>() -> impl Iterator<Item=A> 
where <A as std::str::FromStr>::Err: std::fmt::Debug
{
    let a = std::io::stdin().lines();
    let b = a.map(Result::unwrap);
    let c = b.flat_map(splitter);
    c.map(|x|x.parse().expect("Not an integer!"))
}

fn main() {
    let temp: Vec<usize> = valuereader().collect();
    println!("{:?}", temp);
}

Now to the actual question. How would you do this as idiomatically as possible, ideally without any unsafe code (does something like the function I called to_wrapper already exist)? Is my unsafe code sound? Is there any way to make my Wrapper work for all traits, not just Iterator?

EDIT

To be clearer, this question is about creating a method you can apply anytime you have to give ownership to something that wants a reference, not about how to read from standard input and parse to integers.


Solution

  • Disclaimer: I don't know why anyone would want to use the method described here; it is better to implement the lending iterator manually or avoid using it in the first place. But this method is at least cool.

    My original solution in the question is not sound: the iterator's items have the type &'static str, which is wrong, because they actually borrow from the String owned by the wrapper and can dangle once the wrapper is dropped. As drewtato mentioned, we need to use lending iterators; in this answer I will use the lending-iterator crate.

    I was not able to do this using a single generic wrapper function because of the need for one of the types to be covariant, so I will use a macro instead.

    To create the self-referential struct, I will use self_cell.

    The following code can then be put in a crate and used for "wrapping an iterator that is dependent on a reference so the wrapper contains the referent":

    pub use self_cell;
    pub use lending_iterator;
    pub use lending_iterator::HKT;
    pub use lending_iterator::LendingIterator;
    
    #[macro_export]
    macro_rules! create_wrapper {
        ($wrapper_name:ident, $itertype:ty, $input_type:ty, $create_exp:expr) => {
            type Dependent<'a> = <$itertype as $crate::lending_iterator::higher_kinded_types::WithLifetime<'a>>::T;
            $crate::self_cell::self_cell!(
                pub struct $wrapper_name {
                    owner: $input_type,
                    #[not_covariant]
                    dependent: Dependent,
                }
            );
            impl $wrapper_name {
                fn create(a: $input_type) -> Self {
                    Self::new(a, $create_exp)
                }
            }
    
            #[$crate::lending_iterator::prelude::gat]
            impl $crate::lending_iterator::prelude::LendingIterator for $wrapper_name {
                type Item<'next> = <<$itertype as $crate::lending_iterator::higher_kinded_types::WithLifetime<'next>>::T as ::core::iter::Iterator>::Item;
                fn next<'a>(&'a mut self) -> ::core::option::Option<Self::Item<'a>> {
                    self.with_dependent_mut(|_, iter|iter.next())
                }
            }
        };
    }
    

    When the above macro is expanded, it creates a wrapper type using self_cell. It then defines a create constructor for that type and implements LendingIterator for it. Unfortunately, the type alias Dependent is visible outside of the macro (so two invocations in the same module would clash on that name), and I didn't come up with a way to fix that.

    The original problem cannot be solved directly with this code, because flat_map doesn't exist in the lending_iterator library. Instead, I will flatten after the final map, once we have a normal iterator again.

    I will call the library above iter_to_lending.

    use iter_to_lending::LendingIterator;
    
    iter_to_lending::create_wrapper!(SplitWhitespaceWrapper, iter_to_lending::HKT!(core::str::SplitWhitespace<'_>), String, |data|data.split_whitespace());
    
    pub fn splitter(x: String) -> SplitWhitespaceWrapper {
        SplitWhitespaceWrapper::create(x)
    }
    
    fn valuereader<A: std::str::FromStr>() -> impl Iterator<Item=A> 
    where <A as std::str::FromStr>::Err: std::fmt::Debug
    {
        let a = std::io::stdin().lines();
        let b = a.map(Result::unwrap);
        let c = b.map(splitter);
        c.map(|inner|inner.map_into_iter(|x|x.parse().expect("Not an integer!"))).flatten()
    }
    
    fn main() {
        let temp: Vec<usize> = valuereader().collect();
        println!("{:?}", temp);
    }
    

    Note that this code itself uses no unsafe. This means that this method is probably sound, except for possible compiler bugs triggered by lending_iterator (see danielhenrymantilla/lending-iterator.rs#5) or bugs in self_cell.

    Also note that the answer by drewtato is better for my example problem, but it doesn't work in general.
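The disclaimer at the top of this answer suggests implementing the lending iterator manually. A minimal sketch of what that could look like for the whitespace-splitting example follows. It assumes Rust 1.65 or later (generic associated types). The trait Lending and the struct OwnedWords are hypothetical names invented for illustration; they are not part of lending_iterator or self_cell, and this is not necessarily the approach taken in drewtato's answer.

```rust
// Hypothetical hand-written lending iterator: each item may borrow from
// the iterator itself, so it must be released before the next call.
trait Lending {
    type Item<'a>
    where
        Self: 'a;
    fn next(&mut self) -> Option<Self::Item<'_>>;
}

// Owns the text and remembers a byte offset instead of storing a
// self-referential SplitWhitespace.
struct OwnedWords {
    text: String,
    pos: usize,
}

impl OwnedWords {
    fn new(text: String) -> Self {
        Self { text, pos: 0 }
    }
}

impl Lending for OwnedWords {
    type Item<'a> = &'a str where Self: 'a;

    fn next(&mut self) -> Option<Self::Item<'_>> {
        // Skip leading whitespace in the part that has not been consumed yet.
        let trimmed = self.text[self.pos..].trim_start();
        if trimmed.is_empty() {
            self.pos = self.text.len();
            return None;
        }
        // Byte range of the next token within `self.text`.
        let start = self.text.len() - trimmed.len();
        let len = trimmed.find(char::is_whitespace).unwrap_or(trimmed.len());
        self.pos = start + len;
        Some(&self.text[start..start + len])
    }
}

fn main() {
    let mut words = OwnedWords::new(String::from("10 20\t30"));
    let mut total = 0usize;
    // A `for` loop is not available here because this is not std's Iterator.
    while let Some(word) = words.next() {
        total += word.parse::<usize>().expect("Not an integer!");
    }
    println!("{total}"); // prints 60
}
```

The design choice here is to store an offset rather than a borrowed iterator, which avoids the self-referential struct entirely. The price is that the splitting logic is re-implemented by hand and only covers this one case, whereas the macro above wraps any iterator type.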