How to build the CatBoost C Evaluation Library API?


I had to use a CatBoost model from two programming languages, Golang and Python. The best option (for performance and compatibility) is to use an evaluation library, which is available as a C or C++ API. I followed the official documentation to compile the C API, but several problems had to be solved before it worked.

These are the problems we encountered while trying to create the evaluation library in C:

1.

error: variable has incomplete type 'ModelCalcerHandle' (aka 'void')
    ModelCalcerHandle modelHandle;

2.

c_wrapper.c:16:13: warning: incompatible pointer types passing 'float (*)[3]' to parameter of type 'const float **' [-Wincompatible-pointer-types]
            &floatFeatures, 3,
            ^~~~~~~~~~~~~~
/Users/eli/workspace/test_c_api/catboost/catboost/libs/model_interface/c_api.h:151:19: note: passing argument to parameter 'floatFeatures' here
    const float** floatFeatures, size_t floatFeaturesSize,
                  ^
c_wrapper.c:17:13: warning: incompatible pointer types passing 'char *(*)[4]' to parameter of type 'const char ***' [-Wincompatible-pointer-types]
            &catFeatures, 4,
            ^~~~~~~~~~~~
/Users/eli/workspace/test_c_api/catboost/catboost/libs/model_interface/c_api.h:152:19: note: passing argument to parameter 'catFeatures' here
    const char*** catFeatures, size_t catFeaturesSize,
                  ^
c_wrapper.c:18:13: warning: incompatible pointer types passing 'double (*)[1]' to parameter of type 'double *' [-Wincompatible-pointer-types]
            &result, 1
            ^~~~~~~
/Users/eli/workspace/test_c_api/catboost/catboost/libs/model_interface/c_api.h:153:13: note: passing argument to parameter 'result' here
    double* result, size_t resultSize);

Solution:

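  1. We solved problem #1 by declaring modelHandle as a pointer. ModelCalcerHandle is an alias for void (see the "aka 'void'" note in the error), so a variable of that type cannot be declared directly: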
ModelCalcerHandle *modelHandle = ModelCalcerCreate();

After this change the C program compiled, but running it produced a new error:

[1]    6489 segmentation fault  ./program

  2. The segmentation fault is caused by the pointer-type mismatches reported in the warnings of problem #2. We had to redefine the variables to solve it:
float floatFeaturesRaw[100];
const float *floatFeatures = floatFeaturesRaw;
const char *catFeaturesRaw[2] = {"1", "2"};
const char **catFeatures = catFeaturesRaw;
double resultRaw[1];
double *result = resultRaw;

and

if (!CalcModelPredictionSingle(
        modelHandle,
        &floatFeatures, 3,
        &catFeatures, 4,
        result, 1)) // `&` removed: result is already a pointer
{
   printf("CalcModelPrediction error message: %s\n", GetErrorString());
}
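
Both fixes can be reproduced without the CatBoost library. The following is a minimal sketch, assuming a handle type declared as `typedef void` and a batch function that takes `const float **`, which are the shapes shown in the compiler output above. The names `Handle`, `handle_create`, `handle_delete`, `calc_single`, `calc_batch`, and `demo` are hypothetical stand-ins, not CatBoost functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical opaque handle, mimicking a `typedef void` handle type.
   `Handle h;` would not compile (void is an incomplete type), so the
   handle is only ever used through a pointer. */
typedef void Handle;

struct model { float bias; };  /* hidden implementation detail */

Handle *handle_create(void) {
    struct model *m = malloc(sizeof *m);
    if (m != NULL) m->bias = 1.0f;
    return m;
}

void handle_delete(Handle *h) { free(h); }

/* Single-row shape: a plain pointer to the first feature. */
bool calc_single(Handle *h, const float *features, size_t n,
                 double *result, size_t result_size) {
    assert(features != NULL && result != NULL);
    if (h == NULL || result_size < 1) return false;
    double sum = ((struct model *)h)->bias;
    for (size_t i = 0; i < n; ++i) sum += features[i];
    result[0] = sum;
    return true;
}

/* Batch shape: an array of row pointers, hence `const float **`. */
bool calc_batch(Handle *h, size_t rows, const float **features, size_t n,
                double *result, size_t result_size) {
    if (result_size < rows) return false;
    for (size_t r = 0; r < rows; ++r)
        if (!calc_single(h, features[r], n, &result[r], 1)) return false;
    return true;
}

/* Returns true when both call styles produce the same value. */
bool demo(void) {
    float raw[3] = {0.0f, 89.0f, 1.0f};
    const float *row = raw;   /* the array decays to `const float *` */
    double single[1], batch[1];

    Handle *h = handle_create();
    if (h == NULL) return false;
    /* `&raw` would be `float (*)[3]`, which is NOT `const float **`;
       `&row` is a real pointer-to-pointer. */
    bool ok = calc_single(h, row, 3, single, 1)
           && calc_batch(h, 1, &row, 3, batch, 1);
    handle_delete(h);
    return ok && single[0] == 91.0 && batch[0] == 91.0;
}
```

The point of the sketch is the difference between `&raw` (address of a whole array) and `&row` (address of a pointer). Passing the former where the latter is expected compiles with only a warning but reads memory incorrectly, which matches the segmentation fault described above.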

I'll add the complete solution, from code fixes to how to compile C code, in a comment.


Solution

  • Here is the complete solution:

    1. Clone catboost repo:

    git clone https://github.com/catboost/catboost.git

    2. Open the catboost directory from the local copy of the CatBoost repository.

    3. Build the evaluation library (I chose the shared library, but you can select the variant you need). In my case I had to change the --target-platform argument because I was using a Mac M1 with macOS Ventura 13.1 and clang 14.0.0:

    ./ya make -r catboost/libs/model_interface --target-platform CLANG14-DARWIN-ARM64
    
    4. Create the C file. Fixed C sample code:
    #include <stdio.h>
    #include <c_api.h>
    
    int main(void)
    {
        /* Feature values for a single record. */
        float floatFeaturesRaw[3] = {0, 89, 1};
        const float *floatFeatures = floatFeaturesRaw;
        const char *catFeaturesRaw[4] = {"Others", "443_HTTPS", "6", "24"};
        const char **catFeatures = catFeaturesRaw;
        /* One output value per class. */
        double resultRaw[4];
        double *result = resultRaw;
    
        ModelCalcerHandle *modelHandle = ModelCalcerCreate();
        if (!LoadFullModelFromFile(modelHandle, "catboost_model"))
        {
            printf("LoadFullModelFromFile error message: %s\n", GetErrorString());
            /* Without a loaded model there is nothing to evaluate. */
            ModelCalcerDelete(modelHandle);
            return 1;
        }
        SetPredictionType(modelHandle, 3); /* 3 = APT_PROBABILITY */
        if (!CalcModelPredictionSingle(
                modelHandle,
                floatFeatures, 3,
                catFeatures, 4,
                result, 4))
        {
            printf("CalcModelPrediction error message: %s\n", GetErrorString());
            ModelCalcerDelete(modelHandle);
            return 1;
        }
        for (int i = 0; i < 4; i++)
        {
            printf("%f\n", result[i]);
        }
        ModelCalcerDelete(modelHandle);
        return 0;
    }
    

    To consider:

    • I set the prediction type to APT_PROBABILITY with SetPredictionType.
    • Our model predicts multiple classes, so the result buffer has four elements (result[4]).
    • We only need to predict one record at a time, so we use the CalcModelPredictionSingle function.

    5. Compile the C code:
    gcc -v -o program.out c_code.c -l catboostmodel -I /path/to/catboost/repo/catboost/catboost/libs/model_interface/ -L /path/to/catboost/repo/catboost/catboost/libs/model_interface/
    

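    IMPORTANT: Make sure the compiler reports no warnings or errors. The pointer-type warnings shown in the question are what led to the segmentation fault.

    6. Now you can run it:

    IMPORTANT: Make sure the CatBoost model file (catboost_model) is in the directory you run program.out from, because the code loads it by relative path.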

    ./program.out
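
The program prints four numbers. Assuming each one is the score for one class, a caller will usually want the index of the largest value. A minimal sketch of that post-processing step follows; `argmax` is a hypothetical helper, not part of the CatBoost API, and the values in the usage note are made up rather than real model output.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: index of the largest value in scores[0..n-1].
   Ties resolve to the lowest index. */
size_t argmax(const double *scores, size_t n) {
    assert(scores != NULL && n > 0);
    size_t best = 0;
    for (size_t i = 1; i < n; ++i) {
        if (scores[i] > scores[best]) best = i;
    }
    return best;
}
```

For example, with `double result[4] = {0.10, 0.65, 0.05, 0.20};`, the call `argmax(result, 4)` returns `1`.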