Tags: c++, parquet, shared-ptr, apache-arrow

How to declare and then initialize a parquet arrow::RandomAccessFile in C++


I am trying to adapt this piece of code from the Parquet C++ documentation to open a Parquet file with a FileReader and get the minimum value of a column.

arrow::MemoryPool* pool = arrow::default_memory_pool();
std::shared_ptr<arrow::io::RandomAccessFile> input;
ARROW_ASSIGN_OR_RAISE(input, arrow::io::ReadableFile::Open(path_to_file));

// Open Parquet file reader
std::unique_ptr<parquet::arrow::FileReader> arrow_reader;
ARROW_RETURN_NOT_OK(parquet::arrow::OpenFile(input, pool, &arrow_reader));

// Read entire file as a single Arrow table
std::shared_ptr<arrow::Table> table;
ARROW_RETURN_NOT_OK(arrow_reader->ReadTable(&table));

However, the macro ARROW_ASSIGN_OR_RAISE is causing the error:

error: cannot convert 'const arrow::Status' to 'int' in return

   27 |    ARROW_ASSIGN_OR_RAISE(input, arrow::io::ReadableFile::Open(parquet_path.c_str()));
      |    ^
      |    |
      |    const arrow::Status

The macro documentation says:

Execute an expression that returns a Result, extracting its value into the variable defined by lhs (or returning a Status on error).

So I tried to perform the steps explicitly, separating the initialization from the declaration, something like:

std::shared_ptr<arrow::io::RandomAccessFile> input(nullptr);
input = std::make_shared<arrow::io::RandomAccessFile>(arrow::io::ReadableFile::Open(parquet_path.c_str()));
   

but I am getting the error:

error: invalid new-expression of abstract class type 'arrow::io::RandomAccessFile'
  146 |  { ::new((void *)__p) _Up(std::forward<_Args>(__args)...); }

How can I initialize the shared pointer of arrow::io::RandomAccessFile after the declaration?


Solution

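  • You don't need std::make_shared: arrow::io::ReadableFile::Open() already returns a shared pointer (wrapped in an arrow::Result), whereas std::make_shared tries to construct the abstract class arrow::io::RandomAccessFile directly. ValueOrDie() unwraps the result and aborts the program if the file could not be opened.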

    std::shared_ptr<arrow::io::RandomAccessFile> input =
        arrow::io::ReadableFile::Open(parquet_path.c_str()).ValueOrDie();
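
As for the first error: ARROW_ASSIGN_OR_RAISE returns an arrow::Status from the enclosing function when the expression fails, so it only compiles inside a function whose return type can hold a Status. The message "cannot convert 'const arrow::Status' to 'int' in return" suggests it was used in a function returning int (presumably main). The sketch below illustrates the mechanism without depending on Arrow. DemoStatus, DemoResult, DEMO_ASSIGN_OR_RAISE, RandomAccessLike, ReadableLike, OpenDemo, LoadInput and RunLikeMain are all hypothetical stand-ins written for this example, not Arrow's actual implementation.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical, simplified stand-in for arrow::Status.
class DemoStatus {
 public:
  DemoStatus() = default;  // default-constructed means success
  explicit DemoStatus(std::string msg) : ok_(false), message_(std::move(msg)) {}
  bool ok() const { return ok_; }
  const std::string& message() const { return message_; }
 private:
  bool ok_ = true;
  std::string message_;
};

// Hypothetical, simplified stand-in for arrow::Result<T>: a value or an error.
template <typename T>
class DemoResult {
 public:
  DemoResult(T value) : value_(std::move(value)) {}
  DemoResult(DemoStatus status) : status_(std::move(status)) {}
  bool ok() const { return status_.ok(); }
  const DemoStatus& status() const { return status_; }
  T& value() { return value_; }
 private:
  DemoStatus status_;
  T value_{};
};

// Simplified version of the macro's idea: on failure it executes
// `return <status>;`, so the enclosing function must return a status type.
#define DEMO_ASSIGN_OR_RAISE(lhs, rexpr)                  \
  do {                                                    \
    auto demo_result_ = (rexpr);                          \
    if (!demo_result_.ok()) return demo_result_.status(); \
    lhs = std::move(demo_result_.value());                \
  } while (0)

// Abstract interface (plays the role of arrow::io::RandomAccessFile).
struct RandomAccessLike {
  virtual ~RandomAccessLike() = default;
  virtual long Size() const = 0;
};

// Concrete class (plays the role of arrow::io::ReadableFile).
struct ReadableLike : RandomAccessLike {
  long Size() const override { return 0; }
};

// Factory returning a shared pointer to the concrete class, wrapped in a result.
DemoResult<std::shared_ptr<ReadableLike>> OpenDemo(const std::string& path) {
  if (path.empty()) return DemoStatus("empty path");
  return std::make_shared<ReadableLike>();
}

// The macro compiles here because this function returns DemoStatus.
DemoStatus LoadInput(const std::string& path,
                     std::shared_ptr<RandomAccessLike>* out) {
  std::shared_ptr<RandomAccessLike> input;      // declared first
  DEMO_ASSIGN_OR_RAISE(input, OpenDemo(path));  // initialized afterwards
  assert(input != nullptr);                     // reached only on success
  *out = std::move(input);
  return DemoStatus();
}

// What an int-returning caller such as main can do: check the status
// and translate it into an exit code instead of using the macro directly.
int RunLikeMain(const std::string& path) {
  std::shared_ptr<RandomAccessLike> input;
  DemoStatus st = LoadInput(path, &input);
  return st.ok() ? 0 : 1;
}
```

With real Arrow code the same shape should apply: move the ARROW_ASSIGN_OR_RAISE and ARROW_RETURN_NOT_OK calls into a helper function that returns arrow::Status, and have main call it and check ok() on the returned status. Unlike ValueOrDie(), this lets the program report a failed open instead of aborting.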