I have installed the Cloudera CDH QuickStart VM 5.5, and I'm running a Sqoop action in my Oozie workflow. I encountered an error saying that the MySQL JDBC driver is missing, and I came across an SO answer here that says mysql-connector-java.jar should be placed in Oozie's HDFS shared lib path, under the sqoop path.
When I browse Oozie's HDFS shared lib path, however, I see two sqoop subdirectories where the jar could be copied:
/user/oozie/share/lib/sqoop
and
/user/oozie/share/lib/lib_20151118030154/sqoop
Aside from sqoop, the hive, pig, distcp, and mapreduce-streaming paths also exist under both lib and lib/lib_20151118030154.
So the question is: where do I place my connector jar, in the first path or the second?
What is the difference (in purpose) between these two paths with respect to the sqoop, hive, pig, distcp, and mapreduce-streaming jars for Oozie?
The lib_20151118030154 sub-dir would be the current version of the ShareLibs, as of 18-NOV-2015. The versioning allows you to make updates without stopping the Oozie service -- check the documentation here.
In other words: the Oozie service keeps an in-memory list of the JARs in each ShareLib (based on what was present in the latest version at boot time), so adding a JAR makes no difference until (a) you stop and restart the service or (b) you resync the service as explained in the doc above.
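As a rough sketch of option (b), the commands below copy the connector into the timestamped sqoop directory and then ask Oozie to refresh its ShareLib list. Several things here are assumptions rather than facts from your setup: the jar sitting in the current local directory, the Oozie URL http://localhost:11000/oozie, and your user having write access to /user/oozie (you may need to run the copy as the oozie user). The run helper is purely illustrative; it only prints each command unless APPLY=1 is set, so you can check the commands before touching HDFS.

```shell
#!/usr/bin/env bash
# Sketch only: paths other than SHARELIB_DIR are assumptions, adjust to your VM.

# Timestamped ShareLib directory, taken from the listing in the question.
SHARELIB_DIR="/user/oozie/share/lib/lib_20151118030154"
# Assumed local location of the connector jar.
JAR="mysql-connector-java.jar"
# Assumed Oozie server URL; override by exporting OOZIE_URL.
OOZIE_URL="${OOZIE_URL:-http://localhost:11000/oozie}"

# Hypothetical helper: dry run by default, executes only when APPLY=1.
run() {
  if [ "${APPLY:-0}" = "1" ]; then
    "$@"
  else
    echo "$*"
  fi
}

# 1. Copy the jar into the sqoop ShareLib of the current version.
run hdfs dfs -put "$JAR" "$SHARELIB_DIR/sqoop/"

# 2. Tell the running Oozie service to resync its ShareLib list.
run oozie admin -oozie "$OOZIE_URL" -sharelibupdate

# 3. List the sqoop ShareLib to confirm the jar is now visible to Oozie.
run oozie admin -oozie "$OOZIE_URL" -shareliblist sqoop
```

Run it once as-is to see the three commands, then again with APPLY=1 to execute them. If the jar does not show up in the last listing, the service is probably still serving its old in-memory list, and a restart (option a) should pick it up.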