I'm trying to use the parquet library to create a record iterator object in a function that can be iterated using my own trait called RecordIterator. It looks like this:
fn blah(dataset_info: DatasetInfo, file: File) -> Result<Box<dyn RecordIterator<Item=Record>>, Box<dyn Error>> {
    let reader = SerializedFileReader::new(file).unwrap();
    let iter = (&reader).get_row_iter(None).unwrap();
    let column_type = *dataset_info.column_type.clone();
    let iterator = ParquetRecordIterator {
        iterator: iter,
        column_type: column_type,
        i: 0,
    };
    Ok(Box::new(iterator))
}
The problem is that iter borrows from the local SerializedFileReader it is created from, so returning the ParquetRecordIterator (which implements the RecordIterator trait) fails to compile with this error:
cannot return value referencing local variable `reader` [E0515]
I would ideally not like to break the abstraction here, so how would you suggest implementing this function? Effectively, I would like to break the lifetime link between the reader and the iterator, but I'm not sure how best to do that, or whether a different parquet API would let me do this.
I've tried wrapping the file reader in a Box::new, hoping that the lifetimes would not be tied to an object in the heap, but unfortunately that doesn't seem to work.
I've NOT tried any libraries to deal with self-referential structs because it doesn't seem like they're recommended, so I was hoping for a standard way to solve this.
I've NOT tried looking into other parquet libraries.
This is a hard problem in Rust since you're essentially trying to create a self-referential struct.
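Fortunately for you, you don't need to solve this problem in general, since the library you're using provides direct support for this use case: impl IntoIterator for SerializedFileReader<File>. Calling into_iter() consumes the reader and returns an iterator that owns it instead of borrowing from it. So: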
fn blah(dataset_info: DatasetInfo, file: File) -> Result<Box<dyn RecordIterator<Item=Record>>, Box<dyn Error>> {
    let reader = SerializedFileReader::new(file).unwrap();
    let iter = reader.into_iter();
    let column_type = *dataset_info.column_type.clone();
    let iterator = ParquetRecordIterator {
        iterator: iter,
        column_type: column_type,
        i: 0,
    };
    Ok(Box::new(iterator))
}
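To see why consuming the reader helps, here is a minimal std-only sketch of the same pattern. Reader, rows, owned_rows and count_rows are hypothetical stand-ins made up for illustration, not the parquet API; the only point is the difference between an iterator that borrows its source and one that owns it.

```rust
// Hypothetical stand-in for a file reader; this is NOT the parquet API.
struct Reader {
    rows: Vec<String>,
}

impl Reader {
    // Borrowing iterator: its lifetime is tied to `&self`, much like an
    // iterator obtained from a reference to a local reader.
    fn rows(&self) -> impl Iterator<Item = &String> + '_ {
        self.rows.iter()
    }
}

// Owning iterator: consuming the reader moves its data into the iterator,
// so nothing is left behind on the stack to be referenced.
impl IntoIterator for Reader {
    type Item = String;
    type IntoIter = std::vec::IntoIter<String>;

    fn into_iter(self) -> Self::IntoIter {
        self.rows.into_iter()
    }
}

// Compiles: the boxed iterator owns everything it needs.
fn owned_rows(rows: Vec<String>) -> Box<dyn Iterator<Item = String>> {
    let reader = Reader { rows };
    Box::new(reader.into_iter())
}

// Would be rejected by the borrow checker, because the returned iterator
// would still borrow the local `reader`:
//
// fn borrowed_rows(rows: Vec<String>) -> Box<dyn Iterator<Item = String>> {
//     let reader = Reader { rows };
//     Box::new(reader.rows().cloned())
// }

// The borrowing iterator is still fine while the reader stays alive.
fn count_rows(reader: &Reader) -> usize {
    reader.rows().count()
}
```

Putting the reader in a Box does not change this picture: the Box itself is still a local variable that is dropped when the function returns, so a borrow of its contents cannot outlive the function either.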