
Talend HiveDB connection needs cloudera SerDe


I'm trying to connect Talend Open Studio to Hive.

In Hive I have a table with custom-defined fields (from the cloudera-twitter-example).
Open Studio finds the table without problems, but when I try to Retrieve Schema I get an error.

In the Talend log I get:

!ENTRY org.talend.platform.logging 4 0 2015-09-21 10:49:12.375
!MESSAGE 2015-09-21 10:49:12,372 ERROR org.talend.commons.exception.CommonExceptionHandler  - java.lang.reflect.InvocationTargetException

!STACK 0
java.sql.SQLException: java.lang.reflect.InvocationTargetException
    at org.talend.metadata.managment.hive.EmbeddedHiveDataBaseMetadata.getColumns(EmbeddedHiveDataBaseMetadata.java:401)
    at org.talend.core.model.metadata.builder.database.manager.ExtractManager.getColumnsResultSet(ExtractManager.java:844)
    at org.talend.core.model.metadata.builder.database.manager.ExtractManager.extractColumns(ExtractManager.java:641)
    at org.talend.core.model.metadata.builder.database.manager.ExtractManager.returnMetadataColumnsFormTable(ExtractManager.java:521)
    at org.talend.core.model.metadata.builder.database.ExtractMetaDataFromDataBase.returnMetadataColumnsFormTable(ExtractMetaDataFromDataBase.java:224)
    at org.talend.repository.ui.wizards.metadata.table.database.DatabaseTableForm.pressRetreiveSchemaButton(DatabaseTableForm.java:1150)
    at org.talend.repository.ui.wizards.metadata.table.database.DatabaseTableForm.access$13(DatabaseTableForm.java:1121)
    at org.talend.repository.ui.wizards.metadata.table.database.DatabaseTableForm$4.widgetSelected(DatabaseTableForm.java:795)
    at org.eclipse.swt.widgets.TypedListener.handleEvent(TypedListener.java:248)
    at org.eclipse.swt.widgets.EventTable.sendEvent(EventTable.java:84)
    at org.eclipse.swt.widgets.Display.sendEvent(Display.java:4454)
    at org.eclipse.swt.widgets.Widget.sendEvent(Widget.java:1388)
    at org.eclipse.swt.widgets.Display.runDeferredEvents(Display.java:3799)
    at org.eclipse.swt.widgets.Display.readAndDispatch(Display.java:3409)
    at org.eclipse.jface.window.Window.runEventLoop(Window.java:832)
    at org.eclipse.jface.window.Window.open(Window.java:808)
    at org.talend.repository.metadata.ui.actions.metadata.AbstractCreateTableAction.handleWizard(AbstractCreateTableAction.java:139)
    at org.talend.repository.metadata.ui.actions.metadata.AbstractCreateTableAction$1$1.run(AbstractCreateTableAction.java:1088)
    at org.talend.repository.RepositoryWorkUnit.executeRun(RepositoryWorkUnit.java:93)
    at org.talend.core.repository.model.AbstractRepositoryFactory.executeRepositoryWorkUnit(AbstractRepositoryFactory.java:256)
    at org.talend.repository.localprovider.model.LocalRepositoryFactory.executeRepositoryWorkUnit(LocalRepositoryFactory.java:3227)
    at org.talend.core.repository.model.ProxyRepositoryFactory.executeRepositoryWorkUnit(ProxyRepositoryFactory.java:1996)
    at org.talend.repository.metadata.ui.actions.metadata.AbstractCreateTableAction$1.runInUIThread(AbstractCreateTableAction.java:1110)
    at org.eclipse.ui.progress.UIJob$1.run(UIJob.java:97)
    at org.eclipse.swt.widgets.RunnableLock.run(RunnableLock.java:35)
    at org.eclipse.swt.widgets.Synchronizer.runAsyncMessages(Synchronizer.java:136)
    at org.eclipse.swt.widgets.Display.runAsyncMessages(Display.java:3774)
    at org.eclipse.swt.widgets.Display.readAndDispatch(Display.java:3412)
    at org.eclipse.e4.ui.internal.workbench.swt.PartRenderingEngine$9.run(PartRenderingEngine.java:1151)
    at org.eclipse.core.databinding.observable.Realm.runWithDefault(Realm.java:332)
    at org.eclipse.e4.ui.internal.workbench.swt.PartRenderingEngine.run(PartRenderingEngine.java:1032)
    at org.eclipse.e4.ui.internal.workbench.E4Workbench.createAndRunUI(E4Workbench.java:148)
    at org.eclipse.ui.internal.Workbench$5.run(Workbench.java:636)
    at org.eclipse.core.databinding.observable.Realm.runWithDefault(Realm.java:332)
    at org.eclipse.ui.internal.Workbench.createAndRunWorkbench(Workbench.java:579)
    at org.eclipse.ui.PlatformUI.createAndRunWorkbench(PlatformUI.java:150)
    at org.talend.rcp.intro.Application.start(Application.java:183)
    at org.eclipse.equinox.internal.app.EclipseAppHandle.run(EclipseAppHandle.java:196)
    at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.runApplication(EclipseAppLauncher.java:134)
    at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.start(EclipseAppLauncher.java:104)
    at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:380)
    at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:235)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.eclipse.equinox.launcher.Main.invokeFramework(Main.java:648)
    at org.eclipse.equinox.launcher.Main.basicRun(Main.java:603)
    at org.eclipse.equinox.launcher.Main.run(Main.java:1465)
    at org.eclipse.equinox.launcher.Main.main(Main.java:1438)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.talend.metadata.managment.hive.EmbeddedHiveDataBaseMetadata.getColumns(EmbeddedHiveDataBaseMetadata.java:370)
    ... 49 more
Caused by: java.lang.RuntimeException: MetaException(message:java.lang.ClassNotFoundException Class com.cloudera.hive.serde.JSONSerDe not found)
    at org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:276)
    at org.apache.hadoop.hive.ql.metadata.Table.getDeserializer(Table.java:256)
    at org.apache.hadoop.hive.ql.metadata.Table.getCols(Table.java:595)
    at org.apache.hadoop.hive.ql.metadata.Table.getAllCols(Table.java:612)
    ... 54 more
Caused by: MetaException(message:java.lang.ClassNotFoundException Class com.cloudera.hive.serde.JSONSerDe not found)
    at org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:385)
    at org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:274)
    ... 57 more

The most important message is:

Caused by: java.lang.RuntimeException: MetaException(message:java.lang.ClassNotFoundException Class com.cloudera.hive.serde.JSONSerDe not found)
at org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:276)
at org.apache.hadoop.hive.ql.metadata.Table.getDeserializer(Table.java:256)
at org.apache.hadoop.hive.ql.metadata.Table.getCols(Table.java:595)
at org.apache.hadoop.hive.ql.metadata.Table.getAllCols(Table.java:612)
... 54 more
Caused by: MetaException(message:java.lang.ClassNotFoundException Class com.cloudera.hive.serde.JSONSerDe not found)
at org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:385)
at org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:274)
... 57 more

This message shows that Talend Open Studio for Big Data cannot find the SerDe.
But I placed the Hive SerDe JAR on all Hadoop cluster nodes, and Informatica PowerCenter can get schema information from Hive.
Also, when I use the Hive CLI on the nodes or Hue to run queries over Hive, it works without any problems and without extra configuration such as

ADD JAR
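
For reference, the name of the missing class can be pulled out of a log like the one above with a few lines of Python. This is only a minimal sketch: the helper name `missing_classes` and the `sample` string are made up for illustration, and the pattern assumes the exact "Class ... not found" wording shown in this trace.

```python
import re

# Matches the wording Hive uses in the trace above:
# "ClassNotFoundException Class <fully.qualified.Name> not found"
_MISSING_CLASS = re.compile(r"ClassNotFoundException:? Class ([\w.$]+) not found")


def missing_classes(log_text):
    """Return the distinct missing class names, in order of first appearance."""
    seen = []
    for name in _MISSING_CLASS.findall(log_text):
        if name not in seen:
            seen.append(name)
    return seen


# Two "Caused by" lines copied from the log in this question.
sample = (
    "Caused by: java.lang.RuntimeException: MetaException(message:"
    "java.lang.ClassNotFoundException Class com.cloudera.hive.serde.JSONSerDe not found)\n"
    "Caused by: MetaException(message:java.lang.ClassNotFoundException "
    "Class com.cloudera.hive.serde.JSONSerDe not found)\n"
)

print(missing_classes(sample))  # ['com.cloudera.hive.serde.JSONSerDe']
```

The class is reported once even though the trace repeats it at several levels of the cause chain.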

How can I make Talend Open Studio work with my Hive tables?

  • Hadoop: CDH 5.4.3
  • Talend Open Studio 6.0.1
  • Hive 1.1.0+cdh5.4.3+151

Solution

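  • I solved this issue. By default, Talend Open Studio tries to work with the Hive metastore directly,
    the so-called embedded connection (metastore port is 9093).
    But in the connection settings I saw port 10000, which pointed me to HiveServer2 (Thrift).
    After switching to a standalone connection, it began working.
    This fits the stack trace, which passes through EmbeddedHiveDataBaseMetadata: in embedded mode the SerDe class is apparently looked up inside the Studio itself, so having the JAR on the cluster nodes does not help.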
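
The difference between the two modes can be illustrated by the endpoints they talk to. The sketch below is an assumption-laden illustration, not Talend's actual code: the function name `hive_endpoint` and the host `hive-host.example.com` are hypothetical, the `thrift://` form is the usual `hive.metastore.uris` style, and `jdbc:hive2://` is the standard HiveServer2 JDBC URL format. The ports are the ones mentioned in this answer.

```python
def hive_endpoint(mode, host, port, database="default"):
    """Build the address a client would use in each connection mode.

    "embedded":   the client talks to the metastore itself and must have
                  every SerDe class of the table on its own classpath.
    "standalone": the client talks to HiveServer2, which resolves the
                  SerDe on the server side.
    """
    if mode == "embedded":
        return f"thrift://{host}:{port}"
    if mode == "standalone":
        return f"jdbc:hive2://{host}:{port}/{database}"
    raise ValueError(f"unknown mode: {mode!r}")


# Hypothetical host; ports as given in the answer above.
print(hive_endpoint("embedded", "hive-host.example.com", 9093))
print(hive_endpoint("standalone", "hive-host.example.com", 10000))
```

If port 10000 is entered while the mode is still embedded, the Studio would be addressing HiveServer2 as if it were a metastore, which matches the mismatch described above.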