Search code examples
c++ubuntu-18.04onnxruntime

How can i fix Onnxruntime session->Run problem?


I am trying to write a wrapper for onnxruntime. The model receives one tensor as an input and one tensor as an output. During session->Run, a segmentation error occurs inside the onnxruntime library. Both downloaded library and built from source throw the same error.

Here is error:

Thread 1 "app" received signal SIGSEGV, Segmentation fault.
0x00007ffff6b16eb1 in onnxruntime::logging::ISink::Send (this=0x5555559154c0, timestamp=..., logger_id="", message=...) at /home/listray/Work/Libs/onnxruntime/include/onnxruntime/core/common/logging/isink.h:23
23      SendImpl(timestamp, logger_id, message);

Here is bt:

#0  0x00007ffff6b16eb1 in onnxruntime::logging::ISink::Send (this=0x5555559154c0, timestamp=..., logger_id="", message=...)
    at /home/listray/Work/Libs/onnxruntime/include/onnxruntime/core/common/logging/isink.h:23
#1  0x00007ffff6b174b8 in onnxruntime::logging::LoggingManager::Log (this=0x55555576cbb0, logger_id="", message=...) at /home/listray/Work/Libs/onnxruntime/onnxruntime/core/common/logging/logging.cc:153
#2  0x00007ffff6b16cae in onnxruntime::logging::Logger::Log (this=0x7fffffffcdd0, message=...) at /home/listray/Work/Libs/onnxruntime/include/onnxruntime/core/common/logging/logging.h:291
#3  0x00007ffff6b16ce0 in onnxruntime::logging::Capture::~Capture (this=0x7fffffffc4e0, __in_chrg=<optimized out>) at /home/listray/Work/Libs/onnxruntime/onnxruntime/core/common/logging/capture.cc:57
#4  0x00007ffff6a86301 in onnxruntime::SequentialExecutor::Execute(onnxruntime::SessionState const&, std::vector<int, std::allocator<int> > const&, std::vector<OrtValue, std::allocator<OrtValue> > const&, std::vector<int, std::allocator<int> > const&, std::vector<OrtValue, std::allocator<OrtValue> >&, std::unordered_map<unsigned long, std::function<onnxruntime::common::Status (onnxruntime::TensorShape const&, OrtMemoryInfo const&, OrtValue&, bool&)>, std::hash<unsigned long>, std::equal_to<unsigned long>, std::allocator<std::pair<unsigned long const, std::function<onnxruntime::common::Status (onnxruntime::TensorShape const&, OrtMemoryInfo const&, OrtValue&, bool&)> > > > const&, onnxruntime::logging::Logger const&) (this=0x5555559da4c0, session_state=..., feed_mlvalue_idxs=std::vector of length 1, capacity 1 = {...}, 
    feeds=std::vector of length 1, capacity 1 = {...}, fetch_mlvalue_idxs=std::vector of length 1, capacity 1 = {...}, fetches=std::vector of length 1, capacity 1 = {...}, 
    fetch_allocators=std::unordered_map with 0 elements, logger=...) at /home/listray/Work/Libs/onnxruntime/onnxruntime/core/framework/sequential_executor.cc:309
#5  0x00007ffff6a6d787 in onnxruntime::utils::ExecuteGraphImpl(const onnxruntime::SessionState &, const onnxruntime::FeedsFetchesManager &, const std::vector<OrtValue, std::allocator<OrtValue> > &, std::vector<OrtValue, std::allocator<OrtValue> > &, const std::unordered_map<long unsigned int, std::function<onnxruntime::common::Status(const onnxruntime::TensorShape&, const OrtMemoryInfo&, OrtValue&, bool&)>, std::hash<long unsigned int>, std::equal_to<long unsigned int>, std::allocator<std::pair<long unsigned int const, std::function<onnxruntime::common::Status(const onnxruntime::TensorShape&, const OrtMemoryInfo&, OrtValue&, bool&)> > > > &, ExecutionMode, const bool &, const onnxruntime::logging::Logger &, bool) (session_state=..., feeds_fetches_manager=..., feeds=std::vector of length 1, capacity 1 = {...}, 
    fetches=std::vector of length 1, capacity 1 = {...}, fetch_allocators=std::unordered_map with 0 elements, execution_mode=ORT_SEQUENTIAL, terminate_flag=@0x7fffffffd168: false, logger=..., 
    only_execute_path_to_fetches=false) at /home/listray/Work/Libs/onnxruntime/onnxruntime/core/framework/utils.cc:454
#6  0x00007ffff6a6df37 in onnxruntime::utils::ExecuteGraph (session_state=..., feeds_fetches_manager=..., feeds=std::vector of length 1, capacity 1 = {...}, fetches=std::vector of length 1, capacity 1 = {...}, 
    execution_mode=ORT_SEQUENTIAL, terminate_flag=@0x7fffffffd168: false, logger=..., only_execute_path_to_fetches=false) at /home/listray/Work/Libs/onnxruntime/onnxruntime/core/framework/utils.cc:513
#7  0x00007ffff63e00c2 in onnxruntime::InferenceSession::Run (this=0x555555917110, run_options=..., feed_names=std::vector of length 1, capacity 1 = {...}, feeds=std::vector of length 1, capacity 1 = {...}, 
    output_names=std::vector of length 1, capacity 1 = {...}, p_fetches=0x7fffffffd120) at /home/listray/Work/Libs/onnxruntime/onnxruntime/core/session/inference_session.cc:1206
#8  0x00007ffff637ecc3 in OrtApis::Run (sess=0x555555917110, run_options=0x0, input_names=0x5555559c1a10, input=0x7fffffffd2f8, input_len=1, output_names1=0x555555a521a0, output_names_len=1, 
    output=0x555555a3fb30) at /home/listray/Work/Libs/onnxruntime/onnxruntime/core/session/onnxruntime_c_api.cc:506
#9  0x00007ffff7ba6a93 in Ort::Session::Run (this=0x555555916440, run_options=..., input_names=0x5555559c1a10, input_values=0x7fffffffd2f8, input_count=1, output_names=0x555555a521a0, 
    output_values=0x555555a3fb30, output_count=1) at /home/listray/Work/Libs/onnx_debug/include/onnxruntime_cxx_inline.h:246
#10 0x00007ffff7ba69da in Ort::Session::Run (this=0x555555916440, run_options=..., input_names=0x5555559c1a10, input_values=0x7fffffffd2f8, input_count=1, output_names=0x555555a521a0, output_names_count=1)
    at /home/listray/Work/Libs/onnx_debug/include/onnxruntime_cxx_inline.h:237
#11 0x00007ffff7bb0b31 in ai::common::OnnxruntimeGenericModelWrapper<1ul, 1ul>::process (this=0x55555576cb60, tensors=...)
    at /home/listray/Work/Projects/ml-library/framework/onnxruntime/onnx_generic_model_wrapper.h:48
    ...

The downloaded library stops at onnxruntime::logging::LoggingManager::Log. Here is some wrapper code. Loading the model:

void load_graph(const ByteBuffer& model)
            {
                // enviroment maintains thread pools and other state info
                Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "Vicue Run");
                // initialize session options
                Ort::SessionOptions session_options(nullptr);
                //session_options.SetIntraOpNumThreads(1);

                //Loading models
                session = std::make_unique<Ort::Session>(env,
                                                         static_cast<const void*>(model.data.get()),
                                                         model.length,
                                                         session_options);
            }

session is wrapper's field:

std::unique_ptr<Ort::Session> session;

ByteBuffer:

struct ByteBuffer
    {
        std::unique_ptr<char[]> data;
        size_t length;
    }

Actually wrapper was generic, but this code gets the same error.

std::array<Tensor, outputs> process(std::array<Tensor, inputs> tensors) override
            {
                std::array<Tensor, outputs> result;

                // maybe this should be different if we have multiple input
                Ort::AllocatorWithDefaultOptions allocator;
                auto memory_info = Ort::MemoryInfo::CreateCpu(OrtArenaAllocator, OrtMemTypeDefault);

                if(outputs == 1 && inputs == 1) {
                    auto input_shape = session->GetInputTypeInfo(0).GetTensorTypeAndShapeInfo().GetShape();

                    Ort::Value input_tensor = Ort::Value::CreateTensor<float>(memory_info,
                                                                              tensors[0].data.data(),
                                                                              tensors[0].data.size(),
                                                                              input_shape.data(),
                                                                              input_shape.size());

                    std::vector<const char*> input_node_names = { session->GetInputName(0, allocator) };
                    std::vector<const char*> output_node_names = { session->GetOutputName(0, allocator) };
                    std::vector<Ort::Value> output_tensors = session->Run(Ort::RunOptions{nullptr},
                                                                          input_node_names.data(),
                                                                          &input_tensor,
                                                                          inputs,
                                                                          output_node_names.data(),
                                                                          outputs);

One strange thing that I don't understand. During an error i see this:

(gdb) print this
$4 = (onnxruntime::logging::Capture * const) 0x7fffffffc4e0
(gdb) print this->logger_->logging_manager_->sink_
$5 = std::unique_ptr<onnxruntime::logging::ISink> = {get() = 0x5555559154c0}
(gdb) print *(this->logger_->logging_manager_->sink_)
$6 = {_vptr.ISink = 0x0}

When the logger is created, its *(logging_manager_->sink_) is also {_vptr.ISink = 0x0}.


Solution

  • I do not have the complete idea of your code structure but try making Ort::Env variable static.