I've got an AWS Lambda written in Rust, using the Rust Lambda Runtime. Within that Lambda, I'd like to use Polars to lazily load a Parquet file from S3 and perform some transformations on it before writing it back to another S3 bucket.
The issue I'm having (at least, I think this is the issue) is that the Polars cloud storage implementation seems to call tokio's block_on method, but the Lambda runtime already runs inside a tokio runtime, so I'm getting the following error:
Cannot start a runtime from within a runtime. This happens because a function (like `block_on`) attempted to block the current thread while the thread is being used to drive asynchronous tasks.
The code I'm using to lazily load the Parquet file is as follows:
let path = "s3://my_bucket/example.parquet";
let args = ScanArgsParquet::default();
match LazyFrame::scan_parquet(path, args) {
    Ok(lf) => lf,
    Err(_) => return Err(ReadError::ParquetError),
}
I'm relatively new to Rust, so is there anything I can do to work around this? Or am I going to have to download the file into memory myself using the SDK and then load it (in a non-lazy fashion)?
Based on the comment from @Chayim Friedman on the original post, here's what I got working in the end, using tokio's task::spawn_blocking:
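// `path` is moved into the closure, so it must be 'static (e.g. an owned String or a string literal).
let res = task::spawn_blocking(move || {
    let args = ScanArgsParquet::default();
    // Runs on tokio's blocking thread pool, where blocking calls are allowed.
    LazyFrame::scan_parquet(path, args)
}).await;
match res {
    // The task completed; map any Polars error to our own error type.
    Ok(r) => r.map_err(|_| ReadError::ParquetError),
    // The blocking task panicked or was cancelled.
    Err(_) => Err(ReadError::ThreadError),
}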