I have a Drill cluster with 4 drillbits (Drill 1.14), but for some reason I cannot use the Dynamic UDF feature in the cluster. Every attempt runs into problems.
Let me present 2 scenarios:
Scenario 1
Here is the config (it is the same on all drillbits):
drill.exec: {
  cluster-id: "drill-test",
  zk: {
    connect: "vm29.local:2181,vm32.local:2181,vm39.local:2181",
    root: "drill"
  },
  sys.store.provider.zk.blobroot: "hdfs://vm29.local:9000/apps/drill/pstore/",
  http: {
    enabled: true,
    ssl_enabled: false,
    port: 8047
    session_max_idle_secs: 3600, # Default value 1hr
    cors: {
      enabled: true,
      allowedOrigins: ["*"],
      allowedMethods: ["GET", "POST", "HEAD", "OPTIONS"],
      allowedHeaders: ["X-Requested-With", "Content-Type", "Accept", "Origin"],
    }
  }
}
drill.exec.udf: {
  retry-attempts: 5,
  directory: {
    fs: "hdfs://vm29.local:9000/",
    root: "/drill",
    base: "/udf",
    local: ${drill.exec.udf.directory.base}"/local",
    staging: ${drill.exec.udf.directory.base}"/staging",
    registry: ${drill.exec.udf.directory.base}"/registry",
    tmp: ${drill.exec.udf.directory.base}"/tmp"
  }
}
As you can see, I use HDFS for the UDF directories in this scenario.
When I put the JAR files into the 'staging' folder and run 'CREATE FUNCTION USING JAR', the function registers successfully. BUT I can then use it only on the drillbit where I registered it.
For example, if I run the command in the web UI on vm29, I can use the function only on vm29.
If I then try to register the JAR on a different drillbit, I get an 'already registered' error, yet I still cannot use the function there (it fails with a 'not found' error).
The JAR files are present in hdfs://vm29.local:9000/drill/udf/registry
and the metadata is in the ZK registry.
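To pin down which drillbits can actually resolve the function, one option is to send the same test query to each node's REST endpoint and compare the results. The following is a minimal sketch, not a definitive tool. It assumes the Drill REST API is reachable at `/query.json` on the HTTP port from the config above (8047), that a failed query comes back as an HTTP error or as a JSON body containing `errorMessage`, and that the function is called `my_udf`; the function name and the host list are hypothetical placeholders.

```python
import json
import urllib.error
import urllib.request


def build_request(host, sql, port=8047):
    """Build a POST request for the Drill REST query endpoint on one drillbit."""
    body = json.dumps({"queryType": "SQL", "query": sql}).encode("utf-8")
    return urllib.request.Request(
        f"http://{host}:{port}/query.json",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def function_visible(host, sql, timeout=10):
    """Run the query on one drillbit; return (ok, detail)."""
    try:
        with urllib.request.urlopen(build_request(host, sql), timeout=timeout) as resp:
            payload = json.load(resp)
    except urllib.error.HTTPError as err:
        # Assumption: a failed query may surface as a non-2xx status.
        return False, f"HTTP {err.code}"
    except urllib.error.URLError as err:
        # Host unreachable, connection refused, DNS failure, and so on.
        return False, str(err.reason)
    if isinstance(payload, dict) and "errorMessage" in payload:
        # Assumption: errors may also be reported inside the JSON body.
        return False, payload["errorMessage"]
    return True, "ok"


def check_cluster(hosts, sql):
    """Run the same query against every drillbit and collect the results."""
    return {host: function_visible(host, sql) for host in hosts}


# Hypothetical usage (host names and function name are placeholders):
# results = check_cluster(
#     ["vm29.local", "vm32.local", "vm39.local"],
#     "SELECT my_udf('x') FROM (VALUES(1))",
# )
# for host, (ok, detail) in results.items():
#     print(host, "OK" if ok else "FAILED", detail)
```

If the function resolves only on the node where 'CREATE FUNCTION USING JAR' was run, that matches the behaviour described in this scenario.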
Scenario 2
The config is the same, with one difference: all drillbits use their local filesystem for the UDF directories.
In this case I can register and unregister the function, but it is not usable on every drillbit (not found error). The JAR files are present in the /UDF/registry folder and the metadata is in the ZK registry, but the function does not work.
What am I doing wrong?
I could not find any step-by-step instructions for using the Dynamic UDF feature in a cluster. Maybe you know of some?
Thanks.
Update:
I just had a thought: I use the web console for queries. Maybe it makes a difference whether the function is created through the web console or through a jdbc:zk connection? (I will test.)
Cause & Results
This is a bug in Drill 1.14.
It was reported in the Drill Jira.
Fix with explanation: Drill GitHub repository
This is a regression since 1.13. We have opened a Jira ticket: https://issues.apache.org/jira/browse/DRILL-6762. Meanwhile, you can add custom UDFs manually: https://drill.apache.org/docs/manually-adding-custom-functions-to-drill/.