Search code examples
rustborrow-checker

Idiomatic way to write a decorator over an iterator in Rust


I am new to Rust and still fighting with the borrow-checker.

I am using this question as my general learning opportunity. So, I would appreciate an explanation that is applicable generally to similar situation instead of a response that simply applies to this particular example.

I want to implement a wrapper over RowIterator. I want to build a struct that holds the RowIterator. On next() and has_next(), I will add the desired behavior that's not meaningful to the context of this question.

Here is how my wrapper code looks like -

use std::{fs::File, path::Path};

use parquet::{
    file::{reader::FileReader, serialized_reader::SerializedFileReader},
    record::reader::RowIter,
};

pub struct PIterator<'a> {
    row_iterator: RowIter<'a>,
}

impl<'a> PIterator<'a> {
    pub fn new(file_path: String) -> PIterator<'a> {
        let path = Path::new(&file_path);
        let file = File::open(path).unwrap();
        let reader = SerializedFileReader::new(file).unwrap();
        let row_iter = reader.get_row_iter(None);  // reader is borrowed here
        PIterator {  // cannot return value referencing local variable `reader`...
            row_iterator: row_iter.unwrap(),
        }
    }

    pub fn has_next(&self) -> bool {
        // TODO:
        false
    }

    pub fn next(&self) -> Option<String> {
        // TODO:
        None
    }
}

I get the concept that when a method is invoked on a mutable reference, the object may mutate leaving the references dangling. So, the ownership transfer must happen.

What I cannot comprehend is - who is the "borrower" here? Is it the method "new"? How we determine that?

Also, in this situation, what is the idiomatic way of wrapping the iterator?


Solution

  • In this particular example I'd tell you to just make a function that creates a RowIter for you as it seems you don't actually need the iterator adapter, but let's assume you need to have it for some reason.

    Your problem is that in PIterator you're trying to refer to something that is going to be dropped. You make a RowIter that refers to the reader and then try to return it even though reader is going to be dropped at the end of new.

    Instead of referring to it, transfer ownership of it. To create an iterator that owns it's data, not just refers to it, you idiomatically use the into_iter function from the IntoIter trait. Once you do that you can remove the lifetime from your struct.

    Then, put your next function into an Iterator impl block so that the language knows it is meant to be Iterator::next and not a function whose name just so happens to be next. As for has_next, you don't have to implement that yourself, just use the existing peekable adapter.

    Some other minor adjustments I made are (more) proper error handling instead of unwrap, and &Path instead of needlessly owned String as the argument to new (If you already have a Path you'd have to needlessly convert it back to String just to call new...).

    use std::{fs::File, io, path::Path};
    
    use parquet::{file::serialized_reader::SerializedFileReader, record::reader::RowIter};
    use thiserror::Error;
    
    pub struct PIterator {
        row_iterator: RowIter<'static>,
    }
    
    #[derive(Debug, Error)]
    pub enum PIteratorCreationError {
        #[error(transparent)]
        IOError(#[from] io::Error),
    
        #[error(transparent)]
        ParquetError(#[from] parquet::errors::ParquetError),
    }
    
    impl PIterator {
        pub fn new(file_path: &Path) -> Result<Self, PIteratorCreationError> {
            let file = File::open(file_path)?;
            let reader = SerializedFileReader::new(file)?;
            let row_iter = reader.into_iter();
    
            Ok(PIterator { row_iterator: row_iter })
        }
    }
    
    impl Iterator for PIterator {
        type Item = <RowIter<'static> as Iterator>::Item;
    
        fn next(&mut self) -> Option<Self::Item> {
            self.row_iterator.next()
        }
    }
    
    fn main() {
        let p_iterator = match PIterator::new(Path::new("Some filename...")) {
            Ok(iterator) => iterator,
            Err(creation_error) => match creation_error {
                PIteratorCreationError::IOError(io_error) => {
                    // handle io error here somehow
                    eprintln!("IO error: {io_error}");
                    return;
                }
                PIteratorCreationError::ParquetError(parquet_error) => {
                    // handle parquet error here somehow
                    eprintln!("Parquet error: {parquet_error}");
                    return;
                }
            },
        };
    
        let mut iterator = p_iterator.peekable();
    
        let has_next: bool = iterator.peek().is_some();
    }