I've a python script that outputs the input:
#!/usr/bin/python3
import sys
if __name__ == '__main__':
i = 0
for line in sys.stdin:
print(i, line, end='')
sys.stdout.flush()
i += 1
This script is connected to ClickHouse with this config:
<functions>
<function>
<type>executable</type>
<name>test_function_python</name>
<return_type>String</return_type>
<argument><type>Int64</type></argument>
<format>TabSeparated</format>
<command>test_function.py</command>
<execute_direct>1</execute_direct>
</function>
</functions>
The script is called from ClickHouse:
SELECT test_function_python(number) AS x
FROM numbers(5)
┌─x───┐
│ 0 0 │
│ 1 1 │
│ 2 2 │
│ 3 3 │
│ 4 4 │
└─────┘
So far so good but I want to send multiple parameters to the UDF like this:
SELECT test_function_python(number, number + 3) AS x
FROM numbers(5)
So how do I get two arguments from Python code??
Use multiple <argument>
tag in function config (python_function.xml
in /etc/clickhouse-server
:
<functions>
<function>
<type>executable</type>
<name>test_function_python</name>
<return_type>String</return_type>
<argument><type>String</type></argument>
<argument><type>String</type></argument>
<format>TabSeparated</format>
<command>test_function.py</command>
<execute_direct>1</execute_direct>
</function>
</functions>
Then pass two arguments to python function:
SELECT test_function_python(number, number + 1) AS x
FROM numbers(10)
Query id: 562676d4-e7fb-4aec-86b4-fef41fec4864
┌─x─────────────────┐
│ 0: arg1=0 arg2=1 │
│ 1: arg1=1 arg2=2 │
│ 2: arg1=2 arg2=3 │
│ 3: arg1=3 arg2=4 │
│ 4: arg1=4 arg2=5 │
│ 5: arg1=5 arg2=6 │
│ 6: arg1=6 arg2=7 │
│ 7: arg1=7 arg2=8 │
│ 8: arg1=8 arg2=9 │
│ 9: arg1=9 arg2=10 │
└───────────────────┘
Python code (test_function.py
):
#!/usr/bin/python3
import sys
if __name__ == '__main__':
i = 0
for line in sys.stdin:
arg1, arg2 = line.split('\t')
print(f'{i}: arg1={arg1} arg2={arg2}', end='')
sys.stdout.flush()
i += 1