Tags: java, solr, squirrel-sql, emc

Solr import command not working


I need help with a problem that has me stuck. I have to index data from an EMC xDB source with Solr. I configured xDB so that it can be queried like an RDBMS, then configured SQuirreL to query xDB. That works, for example with this query:

 Select * from books

I configured Solr using the same driver and the same connection string that I used for SQuirreL.

Here is my data-config.xml:

<dataConfig>
<dataSource type="JdbcDataSource" driver="com.emc.ia.xdbjdbc.Driver" url="jdbc:xdbeas://localhost:1235/MaDatabase" user="*****" password="*****"/>
<document name="books">
    <entity name="book" query="select * from books">
        <field column="author" name="author" />
        <field column="price" name="price" />
        <field column="name" name="name" />
        <field column="description" name="description" />
        <field column="title" name="title" />
        <field column="publish_date" name="publish_date" />         
    </entity>
</document>
</dataConfig>

solrconfig.xml

<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
    <lst name="defaults">
        <str name="config">C:\solr\solr-5.2.1\server\solr\jcg\conf\data-config.xml</str>
    </lst>
</requestHandler>

My xDB BOOKS-MetaData.xml:

<?xml version="1.0" encoding="UTF-16"?>
<TABLEMETADATA TABLENAME="books">
  <RECORD-COUNT>2</RECORD-COUNT>
  <ROOT-ELEMENT>books</ROOT-ELEMENT>
  <ROW-ELEMENT>book</ROW-ELEMENT>
  <DATATYPES>
     <BOOKS-TYPE>
      <NAME>author</NAME>
      <COLNO>1</COLNO>
      <DATATYPE>VARCHAR2(75)</DATATYPE>
   </BOOKS-TYPE>
   <BOOKS-TYPE>
      <NAME>title</NAME>
      <COLNO>2</COLNO>
      <DATATYPE>VARCHAR2(75)</DATATYPE>
    </BOOKS-TYPE>
    <BOOKS-TYPE>
      <NAME>genre</NAME>
      <COLNO>3</COLNO>
      <DATATYPE>VARCHAR2(25)</DATATYPE>
   </BOOKS-TYPE>
   <BOOKS-TYPE>
      <NAME>price</NAME>
      <COLNO>4</COLNO>
      <DATATYPE>VARCHAR2(25)</DATATYPE>
   </BOOKS-TYPE>
   <BOOKS-TYPE>
      <NAME>publish_date</NAME>
      <COLNO>5</COLNO>
      <DATATYPE>DATETIME</DATATYPE>
   </BOOKS-TYPE>
   <BOOKS-TYPE>
      <NAME>description</NAME>
      <COLNO>6</COLNO>
      <DATATYPE>VARCHAR2(350)</DATATYPE>
    </BOOKS-TYPE>
  </DATATYPES>
</TABLEMETADATA>

Part of the Solr response:

"verbose-output": [
"entity:book",
[
  "document#1",
  [
    "query",
    "select * from books",
    "time-taken",
    "0:0:0.114",
    "EXCEPTION",
    "org.apache.solr.handler.dataimport.DataImportHandlerException: Error reading data from database Processing Document # 1

When I check the Solr logs, I see three exceptions. Number 1:

getNext() failed for query 'select * from books':org.apache.solr.handler.dataimport.DataImportHandlerException: Error reading data from database Processing Document # 1
at org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:70)
at org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.getARow(JdbcDataSource.java:398)
at org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.access$600(JdbcDataSource.java:296)
at org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator$1.next(JdbcDataSource.java:336)
at org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator$1.next(JdbcDataSource.java:328)
at org.apache.solr.handler.dataimport.EntityProcessorBase.getNext(EntityProcessorBase.java:133)
at org.apache.solr.handler.dataimport.SqlEntityProcessor.nextRow(SqlEntityProcessor.java:74)
at org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:243)
at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:475)
at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:414)
at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:329)
at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:232)
at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:416)
at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:480)
at org.apache.solr.handler.dataimport.DataImportHandler.handleRequestBody(DataImportHandler.java:185)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:143)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:2064)
at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:654)
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:450)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:227)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:196)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577)
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:215)
at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:110)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
at org.eclipse.jetty.server.Server.handle(Server.java:497)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:310)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257)
at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.sql.SQLFeatureNotSupportedException: Method com.emc.ia.xdbjdbc.xdb.XdbResultSet.getObject-columnLabel is not yet implemented.
at com.emc.ia.xdbjdbc.Driver.notImplemented(Driver.java:745)
at com.emc.ia.xdbjdbc.xdb.XdbResultSet.getObject(XdbResultSet.java:659)
at org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.getARow(JdbcDataSource.java:358)
... 39 more

Number 2:

Exception while processing: book document : SolrInputDocument(fields: []):org.apache.solr.handler.dataimport.DataImportHandlerException: Error reading data from database Processing Document # 1
at org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:70)
at org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.getARow(JdbcDataSource.java:398)
at org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.access$600(JdbcDataSource.java:296)
at org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator$1.next(JdbcDataSource.java:336)
at org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator$1.next(JdbcDataSource.java:328)
at org.apache.solr.handler.dataimport.EntityProcessorBase.getNext(EntityProcessorBase.java:133)
at org.apache.solr.handler.dataimport.SqlEntityProcessor.nextRow(SqlEntityProcessor.java:74)
at org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:243)
at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:475)
at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:414)
at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:329)
at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:232)
at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:416)
at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:480)
at org.apache.solr.handler.dataimport.DataImportHandler.handleRequestBody(DataImportHandler.java:185)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:143)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:2064)
at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:654)
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:450)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:227)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:196)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577)
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:215)
at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:110)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
at org.eclipse.jetty.server.Server.handle(Server.java:497)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:310)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257)
at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.sql.SQLFeatureNotSupportedException: Method com.emc.ia.xdbjdbc.xdb.XdbResultSet.getObject-columnLabel is not yet implemented.
at com.emc.ia.xdbjdbc.Driver.notImplemented(Driver.java:745)
at com.emc.ia.xdbjdbc.xdb.XdbResultSet.getObject(XdbResultSet.java:659)
at org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.getARow(JdbcDataSource.java:358)
... 39 more

And the last one:

Full Import failed:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.solr.handler.dataimport.DataImportHandlerException: Error reading data from database Processing Document # 1
at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:270)
at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:416)
at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:480)
at org.apache.solr.handler.dataimport.DataImportHandler.handleRequestBody(DataImportHandler.java:185)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:143)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:2064)
at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:654)
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:450)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:227)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:196)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577)
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:215)
at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:110)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
at org.eclipse.jetty.server.Server.handle(Server.java:497)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:310)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257)
at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: org.apache.solr.handler.dataimport.DataImportHandlerException: Error reading data from database Processing Document # 1
at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:416)
at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:329)
at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:232)
... 29 more
Caused by: org.apache.solr.handler.dataimport.DataImportHandlerException: Error reading data from database Processing Document # 1
at org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:70)
at org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.getARow(JdbcDataSource.java:398)
at org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.access$600(JdbcDataSource.java:296)
at org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator$1.next(JdbcDataSource.java:336)
at org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator$1.next(JdbcDataSource.java:328)
at org.apache.solr.handler.dataimport.EntityProcessorBase.getNext(EntityProcessorBase.java:133)
at org.apache.solr.handler.dataimport.SqlEntityProcessor.nextRow(SqlEntityProcessor.java:74)
at org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:243)
at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:475)
at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:414)
... 31 more
Caused by: java.sql.SQLFeatureNotSupportedException: Method com.emc.ia.xdbjdbc.xdb.XdbResultSet.getObject-columnLabel is not yet implemented.
at com.emc.ia.xdbjdbc.Driver.notImplemented(Driver.java:745)
at com.emc.ia.xdbjdbc.xdb.XdbResultSet.getObject(XdbResultSet.java:659)
at org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.getARow(JdbcDataSource.java:358)
... 39 more

Thank you in advance for your help, because I really don't understand why it works with SQuirreL and not with Solr.

Edit, in reply to @Cheffe:

It was indeed an xDB driver problem caused by the getObject() method. I looked at the SQuirreL code, and it works there because it uses the index of the column rather than its name.

int idx = _columnIndices != null ? _columnIndices[i] : i + 1;
try
{
    final int columnType = _rsmd.getColumnType(idx);
    //final String columnClassName = _rsmd.getColumnClassName(idx);
    switch (columnType)
    {
        case Types.NULL:
            row[i] = null;
            break;

        // TODO: When JDK1.4 is the earliest JDK supported
        // by Squirrel then remove the hardcoding of the
        // boolean data type.
        case Types.BIT:
        case 16:
        // case Types.BOOLEAN:
            row[i] = _rs.getObject(idx);
            . . .

Anyway, it works with

convertType="true"
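To illustrate the difference between the two access styles, here is a minimal, self-contained sketch. It does not use the xDB driver: indexOnlyResultSet is a hypothetical stand-in (built with java.lang.reflect.Proxy) that mimics a driver implementing getObject(int) but not getObject(String), and the row values are made-up sample data.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.SQLFeatureNotSupportedException;

public class LabelVsIndexDemo {

    // Hypothetical stand-in for a driver whose ResultSet only implements
    // index-based access. It exposes a single row of values.
    public static ResultSet indexOnlyResultSet(Object[] row) {
        InvocationHandler handler = (proxy, method, args) -> {
            boolean isGetObject = method.getName().equals("getObject")
                    && args != null && args.length == 1;
            if (isGetObject && args[0] instanceof Integer) {
                return row[(Integer) args[0] - 1]; // JDBC columns are 1-based
            }
            if (isGetObject) {
                // Same kind of failure as in the stack traces above.
                throw new SQLFeatureNotSupportedException(
                        "getObject-columnLabel is not yet implemented.");
            }
            throw new UnsupportedOperationException(method.getName());
        };
        return (ResultSet) Proxy.newProxyInstance(
                LabelVsIndexDemo.class.getClassLoader(),
                new Class<?>[] {ResultSet.class},
                handler);
    }

    // Returns true if looking the column up by label fails as "not supported".
    public static boolean labelLookupFails(ResultSet rs, String label) throws SQLException {
        try {
            rs.getObject(label);
            return false;
        } catch (SQLFeatureNotSupportedException e) {
            return true;
        }
    }

    public static void main(String[] args) throws SQLException {
        ResultSet rs = indexOnlyResultSet(new Object[] {"Jane Doe", "A Title"});
        System.out.println("by index: " + rs.getObject(1));                      // by index: Jane Doe
        System.out.println("by label fails: " + labelLookupFails(rs, "author")); // by label fails: true
    }
}
```

SQuirreL takes the first path (lookup by index), while Solr's DIH with the default settings takes the second (lookup by label), which is why the same driver behaves differently in the two tools.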

Solution

  • After checking the Solr sources, it appears that this database's JDBC driver is probably not fit to be used with Solr's DIH. Unfortunately the database is proprietary, so I cannot try it out myself.

    The exceptions are thrown from within your vendor's driver. All three root causes are located there, as you can see when you scroll down to the very bottom of each stack trace. All of these stack traces tell you

    Method xxx.getObject-columnLabel is not yet implemented.

    This means that your database vendor, EMC, did not implement this method in their JDBC driver. If you do not want to get your hands dirty, you should file a bug report or feature request with them, stating that you need this method implemented.


    One shot you may have: the last line in the stack trace that refers to Solr's sources leads to the internal getARow method

    private Map<String, Object> getARow() {
      if (resultSet == null)
        return null;
    
      Map<String, Object> result = new HashMap<>();
      for (String colName : colNames) {
        try {
          if (!convertType) {
            Object value = resultSet.getObject(colName); // <-- the exception is thrown here
            if (value instanceof BigDecimal || value instanceof BigInteger) {
              result.put(colName, value.toString());
            } else {
              result.put(colName, value);
            }
            continue;
          }
          // more code follows here, but is left out
    

    You can avoid the call to resultSet.getObject by setting a configuration parameter of your DataImportHandler, namely

    convertType (default: false) – Applies an additional conversion from the field type returned by the database to the field type defined in the schema.xml. The default value seems to be safer, because it does not cause extra, magical conversion. However, in special cases (eg BLOB fields), that conversion is one of the ways of solving the problem. [Reference]

    You need to set this in your data-config.xml:

    <dataConfig>
    <dataSource convertType="true" type="JdbcDataSource" driver="com.emc.ia.xdbjdbc.Driver" url="jdbc:xdbeas://localhost:1235/MaDatabase" user="*****" password="*****"/>
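As I understand the DIH source, enabling convertType makes it read each column through a typed getter chosen from the column's type, instead of calling getObject. The following is only a simplified sketch of that idea, not Solr's actual code: readTyped and stubRow are hypothetical names, the stub mimics a driver that implements getString(String) and getInt(String) but not getObject, and the values are made-up sample data.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.SQLFeatureNotSupportedException;
import java.sql.Types;
import java.util.HashMap;
import java.util.Map;

public class TypedGetterSketch {

    // Picks a typed, label-based getter from the given java.sql.Types code,
    // so that getObject(String) is never called.
    public static Object readTyped(ResultSet rs, String colName, int sqlType) throws SQLException {
        switch (sqlType) {
            case Types.INTEGER: return rs.getInt(colName);
            case Types.BIGINT:  return rs.getLong(colName);
            case Types.DOUBLE:  return rs.getDouble(colName);
            case Types.BOOLEAN: return rs.getBoolean(colName);
            default:            return rs.getString(colName);
        }
    }

    // Hypothetical single-row stand-in for a driver. It only handles the
    // label-based getString and getInt calls; getObject always fails.
    public static ResultSet stubRow(Map<String, String> values) {
        InvocationHandler handler = (proxy, method, args) -> {
            switch (method.getName()) {
                case "getString":
                    return values.get((String) args[0]);
                case "getInt":
                    return Integer.parseInt(values.get((String) args[0]));
                case "getObject":
                    throw new SQLFeatureNotSupportedException("getObject is not yet implemented.");
                default:
                    throw new UnsupportedOperationException(method.getName());
            }
        };
        return (ResultSet) Proxy.newProxyInstance(
                TypedGetterSketch.class.getClassLoader(),
                new Class<?>[] {ResultSet.class},
                handler);
    }

    public static void main(String[] args) throws SQLException {
        Map<String, String> values = new HashMap<>();
        values.put("title", "A Title");
        values.put("price", "12");
        ResultSet rs = stubRow(values);
        System.out.println(readTyped(rs, "title", Types.VARCHAR));
        System.out.println(readTyped(rs, "price", Types.INTEGER));
    }
}
```

Note that this path still looks columns up by label, so it only helps if the driver implements the typed label-based getters.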
    

    However, I cannot tell you whether this succeeds with your database and its driver. It will probably just lead to another exception. In that case you will need to fall back to one of these options:

    1. Ask your vendor to implement the method
    2. Write a custom data import routine
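For the second option, the reading side of a custom import routine could look roughly like the sketch below. It relies only on index-based access, the way SQuirreL does, and assumes that the driver implements getMetaData, getColumnCount, getColumnLabel and getObject(int), which I cannot verify for xDB. The names readRows and stubResultSet are hypothetical, and the stub with its sample rows only exists so the example runs without a database.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.sql.SQLFeatureNotSupportedException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class IndexBasedRowReader {

    // Reads every row using only index-based getters, keyed by column label.
    public static List<Map<String, Object>> readRows(ResultSet rs) throws SQLException {
        ResultSetMetaData meta = rs.getMetaData();
        int columnCount = meta.getColumnCount();
        List<Map<String, Object>> rows = new ArrayList<>();
        while (rs.next()) {
            Map<String, Object> row = new LinkedHashMap<>();
            for (int i = 1; i <= columnCount; i++) { // JDBC columns are 1-based
                row.put(meta.getColumnLabel(i), rs.getObject(i));
            }
            rows.add(row);
        }
        return rows;
    }

    // Hypothetical in-memory stand-in for a real driver, for demonstration only.
    public static ResultSet stubResultSet(String[] labels, Object[][] data) {
        InvocationHandler metaHandler = (proxy, method, args) -> {
            switch (method.getName()) {
                case "getColumnCount": return labels.length;
                case "getColumnLabel": return labels[(Integer) args[0] - 1];
                default: throw new UnsupportedOperationException(method.getName());
            }
        };
        ResultSetMetaData meta = (ResultSetMetaData) Proxy.newProxyInstance(
                IndexBasedRowReader.class.getClassLoader(),
                new Class<?>[] {ResultSetMetaData.class},
                metaHandler);

        int[] cursor = {-1}; // positioned before the first row
        InvocationHandler rsHandler = (proxy, method, args) -> {
            switch (method.getName()) {
                case "getMetaData": return meta;
                case "next": return ++cursor[0] < data.length;
                case "getObject":
                    if (args.length == 1 && args[0] instanceof Integer) {
                        return data[cursor[0]][(Integer) args[0] - 1];
                    }
                    throw new SQLFeatureNotSupportedException("label access is not implemented");
                default: throw new UnsupportedOperationException(method.getName());
            }
        };
        return (ResultSet) Proxy.newProxyInstance(
                IndexBasedRowReader.class.getClassLoader(),
                new Class<?>[] {ResultSet.class},
                rsHandler);
    }

    public static void main(String[] args) throws SQLException {
        ResultSet rs = stubResultSet(
                new String[] {"author", "title"},
                new Object[][] {{"Author A", "Title A"}, {"Author B", "Title B"}});
        for (Map<String, Object> row : readRows(rs)) {
            System.out.println(row);
        }
    }
}
```

With a real connection, readRows would be called on the ResultSet returned by executing select * from books, and the resulting maps could then be sent to Solr with a client library such as SolrJ. That part is left out here.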