Search code examples
monetdb

Monetdbe Python UDF


Given that monetdbe is a Python package, I'm optimistic that Python user-defined-functions are possible, but I haven't been able to find an example. I've tried this:

drop function every_other_letter;

create function every_other_letter(s string)
returns string
language python {
     letters = "".join(s[::2])
     return letters
};

select every_other_letter('foobarbaz');

I get this error:

ParseException:SQLparser:42000!Internal error while compiling statement: TypeException:user.p2_1[4]:'pyapi3.eval' library error in: Embedded Python 3 has not been enabled. Start server with --set embedded_py=3

Is there any way to set these flags in the embedded version?


Solution

  • The LANGUAGE PYTHON UDF's are a nice development feature in MonetDB's server installation but this feature requires an additional Python module to be loaded. And there is currently no way to configure monetdbe to load the required python module.

    However assuming you have performance requirement for some production setting that are not met with the out-of-the-box SQL toolset in monetdbe, it makes more sense to implement a custom UDF extension written in C/C++. In regular MonetDB's server installation, the database server mserver5 can load an arbitrary extension module using the --loadmodule=<module> command option. But there is no equivalent monetdbe_option as of yet.

    You might consider adding a feature request for this on monetdbe-python's github repository.

    However there seems to exist a functioning undocumented workaround for adding UDF extensions to monetdbe. During its initialization, monetdbe attempts to load a set of hard coded modules. One of those is a module named "udf". You can create your own implementation of this module and load it into monetdbe.

    Creating a native UDF extension is outside of the scope of this question and answer but there exist a nice up-to-date tutorial for writing UDF extensions for MonetDB here. Following the steps described in that tutorial, you end up with a SQL function revstr which has a user defined native implementation. The following Python script demonstrate its use:

    from monetdbe import connect
    
    with connect(autocommit=True) as con:
        cur = cur.execute("select revstr('abcde')")
        result = cur.fetchall()
        print(result)
    

    Make sure that the library containing your UDF extension is in the search path of the dynamic linker:

    LD_LIBRARY_PATH=<path to directory containing lib_udf> python test.py