sql xml solr data-import ucanaccess

Solr dataimport and SQL with xml escape


I tried the following SQL in the entity tag of my dataimport config file:

<entity name="Page" dataSource="a1" query="SELECT &amp;apos;26484-&amp;apos;&amp;amp;`book`.id&amp;amp;&amp;apos;-&amp;apos;&amp;amp;`book`.page&amp;amp;&amp;apos;-&amp;apos;&amp;amp;`book`.part AS PageID, `book`.id AS pid, `book`.nass AS Content, `book`.part AS Part, `book`.page AS PageNum FROM `book` ORDER BY `book`.id, `book`.page">

The SQL query contains characters that must be escaped in XML, ' and &. However, I receive the following error in the log:

DocBuilder     Exception while processing: Page document : SolrInputDocument(fields: []):org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to execute query: SELECT '26484-'&`book`.id&'-'&`book`.page&'-'&`book`.part AS PageID, `book`.id AS pid, `book`.nass AS Content, `book`.part AS Part, `book`.page AS PageNum FROM `book` ORDER BY `book`.id, `book`.page Processing Document # 1

and

DataImporter    Full Import failed:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to execute query: SELECT '26484-'&`book`.id&'-'&`book`.page&'-'&`book`.part AS PageID, `book`.id AS pid, `book`.nass AS Content, `book`.part AS Part, `book`.page AS PageNum FROM `book` ORDER BY `book`.id, `book`.page Processing Document # 1

I use a JDBC driver.

When I removed the concatenation from the SQL query, and with it the escaped characters, the error went away, but I got a SolrWriter warning about a duplicate unique key, which I understand. However, how can I use the SQL query shown above?


Solution

    1. There is no need to escape single quotes. The attribute value is delimited by double quotes, so you can put ' directly into the query.

    2. Each & in the SQL must be escaped exactly once, as &amp;. As posted, your query is escaped twice: &amp;apos; reaches the database as the literal text &apos; rather than ', and &amp;amp; arrives as &amp; rather than &.

    (2) is most likely what corrupts your query.

    <entity name="Page" dataSource="a1" query="SELECT '26484-'&amp;`book`.id&amp;'-'&amp;`book`.page&amp;'-'&amp;`book`.part AS PageID, `book`.id AS pid, `book`.nass AS Content, `book`.part AS Part, `book`.page AS PageNum FROM `book` ORDER BY `book`.id, `book`.page">
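To check what a query attribute looks like after XML parsing, a small Python sketch can help. It is not part of the Solr setup and only assumes that Solr's config parser resolves entity references the way any conforming XML parser does. The shortened query and the `parsed_query` helper are illustrative names, not something from the original config.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Intended SQL, shortened from the question's query for readability.
sql = "SELECT '26484-'&`book`.id AS PageID FROM `book`"

# escape() rewrites only &, < and >. Single quotes stay literal, which is
# fine inside an attribute delimited by double quotes.
single_escaped = escape(sql)

# Escaping a second time produces the &amp;amp; pattern.
double_escaped = escape(single_escaped)


def parsed_query(attr_value: str) -> str:
    """Parse a one-element config and return the query string the parser yields."""
    entity = ET.fromstring(f'<entity name="Page" query="{attr_value}"/>')
    return entity.get("query")


print(single_escaped)                 # text to paste into the config file
print(parsed_query(single_escaped))   # identical to the intended SQL
print(parsed_query(double_escaped))   # still contains &amp;, not valid SQL
print(parsed_query("&amp;apos;26484-&amp;apos;"))  # literal &apos;, not a quote
```

If the second printed line matches the SQL you meant to run, the escaping in the config is correct, and any remaining failure would have to come from the SQL itself or from the JDBC driver rather than from the XML.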