Search code examples
pythonverticavsql

Unable to use TransformFunction in vertica_sdk


What i am trying to do is execute vertica's string tokenizer example which is written in python.

Here is the link to the said example:https://www.vertica.com/docs/9.2.x/HTML/Content/Authoring/ExtendingVertica/UDx/TransformFunctions/Python/ExampleStringTokenizer.htm?TocPath=Extending Vertica|Developing%20User-Defined%20Extensions%20(UDxs)|Transform%20Functions%20(UDTFs)|Python%20API|_____2

This is what my code looks like

import vertica_sdk
class StringTokenizer(vertica_sdk.TransformFunction):
    """
    Transform function which tokenizes its inputs.
    For each input string, each of the whitespace-separated tokens of that
    string is produced as output.
    """
    def processPartition(self, server_interface, input, output):
        while True:
            for token in input.getString(0).split():
                output.setString(0, token)
                output.next()
            if not input.next():
                break


class StringTokenizerFactory(vertica_sdk.TransformFunctionFactory):
    def getPrototype(self, server_interface, arg_types, return_type):
        arg_types.addVarchar()
        return_type.addVarchar()
    def getReturnType(self, server_interface, arg_types, return_type):
        return_type.addColumn(arg_types.getColumnType(0), "tokens")
    def createTransformFunction(cls, server_interface):
        return StringTokenizer()

This is what im getting as my output when I execute the command.

create library sampy as '/home/dbadmin/udx/tokenize.py' language 'Python';

output

ROLLBACK 2175:  An error occurred when loading library file on node v_prmtest_node0001, message:
Failure in UDx RPC call InvokeCheckLibrary(): Error calling setupExecContext() in User Defined Object [] at [/scratch_a/release/svrtar19690/vbuild/vertica/OSS/UDxFence/PythonInterface.cpp:168], error code: 0, message: Error [/scratch_a/release/svrtar19690/vbuild/vertica/OSS/UDxFence/PythonInterface.cpp:204] function ['import']
(Python error type [<class 'AttributeError'>])
Traceback (most recent call last):
  File "/home/dbadmin/PRMTEST/v_prmtest_node0001_catalog/Libraries/02d86d505e41731d36151e9e9da31afc00b0000000561680/sampy_02d86d505e41731d36151e9e9da31afc00b0000000561680.py", line 2, in <module>
    class StringTokenizer(vertica_sdk.TransformFunction):
AttributeError: module 'vertica_sdk' has no attribute 'TransformFunction'

Solution

  • Transform function is present in vertica version 9 on wards.

    The reason why i wasn't able to execute my code on is because I was using Vertica version 8. Python User Defined Transform Functions are only supported in Vertica version 9 or above. UDTFs can be written for Vertica 8 in Java, C++ or R.