Search code examples
pythonclickhouse

How to send multiple arguments to Executable UDF in ClickHouse?


I've a python script that outputs the input:

#!/usr/bin/python3

import sys

if __name__ == '__main__':
    i = 0
    for line in sys.stdin:
        print(i, line, end='')
        sys.stdout.flush()
        i += 1

This script is connected to ClickHouse with this config:

<functions>
    <function>
        <type>executable</type>
        <name>test_function_python</name>
        <return_type>String</return_type>
        <argument><type>Int64</type></argument>
        <format>TabSeparated</format>
        <command>test_function.py</command>
        <execute_direct>1</execute_direct>
    </function>
</functions>

The script is called from ClickHouse:

SELECT test_function_python(number) AS x        
FROM numbers(5)                                 
                                                
                                                
┌─x───┐                                         
│ 0 0 │                                         
│ 1 1 │                                         
│ 2 2 │                                         
│ 3 3 │                                         
│ 4 4 │                                         
└─────┘                                         

So far so good but I want to send multiple parameters to the UDF like this:

SELECT test_function_python(number, number + 3) AS x        
FROM numbers(5)                                 

So how do I get two arguments from Python code??


Solution

  • Use multiple <argument> tag in function config (python_function.xml in /etc/clickhouse-server:

    <functions>                                     
        <function>                                  
            <type>executable</type>                 
            <name>test_function_python</name>       
            <return_type>String</return_type>       
            <argument><type>String</type></argument>
            <argument><type>String</type></argument>
            <format>TabSeparated</format>           
            <command>test_function.py</command>     
            <execute_direct>1</execute_direct>      
        </function>                                 
    </functions>                                    
    

    Then pass two arguments to python function:

    SELECT test_function_python(number, number + 1) AS x   
    FROM numbers(10)                                       
                                                           
    Query id: 562676d4-e7fb-4aec-86b4-fef41fec4864         
                                                           
    ┌─x─────────────────┐                                  
    │ 0: arg1=0 arg2=1  │                                  
    │ 1: arg1=1 arg2=2  │                                  
    │ 2: arg1=2 arg2=3  │                                  
    │ 3: arg1=3 arg2=4  │                                  
    │ 4: arg1=4 arg2=5  │                                  
    │ 5: arg1=5 arg2=6  │                                  
    │ 6: arg1=6 arg2=7  │                                  
    │ 7: arg1=7 arg2=8  │                                  
    │ 8: arg1=8 arg2=9  │                                  
    │ 9: arg1=9 arg2=10 │                                  
    └───────────────────┘                                  
    

    Python code (test_function.py):

    #!/usr/bin/python3
    
    import sys
    
    if __name__ == '__main__':
        i = 0
        for line in sys.stdin:
            arg1, arg2 = line.split('\t')
            print(f'{i}: arg1={arg1} arg2={arg2}', end='')
            sys.stdout.flush()
            i += 1