I'm trying to upload an XML document (an RSS feed) to Solr. I use this call to index the file:
curl "http://localhost:8983/solr/1-3/update?commit=true&commitWithin=10000&tr=updateXml.xsl&literalsOverride=true&literal.client_uid=3" -H "Content-Type: text/xml" --data-binary @myfile.xml
The core name is 1-3. Solr processes the file correctly, and I can search all the products and fields I have specified in schema.xml, as long as I either leave client_uid out of the schema or make it an optional field.
This is an extra field that I'd like to supply via the URL (the documents themselves don't contain this value):
<field name="client_uid" type="long" indexed="true" stored="true" multiValued="false" required="true"/>
My file has around 22,000 documents in it. I try to supply the value via the literal.client_uid parameter in the URL, but I'm getting this error:
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">400</int><int name="QTime">3007</int></lst><lst name="error"><str name="msg">[doc=117755] missing required field: client_uid</str><int name="code">400</int></lst>
</response>
I'm using Solr 5.4.0.
What is wrong?
Figured it out. As @Karsten R. explained, this won't work because the literal.* parameters are handled by a different request handler, and the UpdateRequestHandler that processes XML updates doesn't support them.
I decided to use an updateRequestProcessorChain (in solrconfig.xml) and created a .jar library with a new UpdateRequestProcessorFactory class, which I included in the processor chain.
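To illustrate the idea behind a processor chain, here is a minimal plain-Java sketch of the pattern: each processor may modify the document and then hands it to the next one. The classes below (`ChainSketch`, `SketchProcessor`, `AddClientUidProcessor`, `RequiredFieldCheckProcessor`) are hypothetical stand-ins invented for this example. They are not Solr's classes and only model the behaviour under simplified assumptions (a document is a plain map, request parameters are a plain map).

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/** Wires the sketch chain together; a stand-in for what the chain config does. */
final class ChainSketch {
    static Map<String, Object> index(Map<String, String> params, Map<String, Object> doc) {
        // The custom processor runs first, the required-field check runs last.
        SketchProcessor chain =
                new AddClientUidProcessor(params, new RequiredFieldCheckProcessor());
        chain.processAdd(doc);
        return doc;
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("id", "117755");
        System.out.println(index(Collections.singletonMap("clientId", "3"), doc));
    }
}

/** Simplified stand-in for an update processor (hypothetical, not Solr's API). */
abstract class SketchProcessor {
    private final SketchProcessor next; // null marks the end of the chain

    SketchProcessor(SketchProcessor next) {
        this.next = next;
    }

    /** Default behaviour: pass the document on to the next processor, if any. */
    void processAdd(Map<String, Object> doc) {
        if (next != null) {
            next.processAdd(doc);
        }
    }
}

/** Copies the clientId request parameter into the client_uid field, then delegates. */
final class AddClientUidProcessor extends SketchProcessor {
    private final Map<String, String> requestParams;

    AddClientUidProcessor(Map<String, String> requestParams, SketchProcessor next) {
        super(next);
        this.requestParams = requestParams;
    }

    @Override
    void processAdd(Map<String, Object> doc) {
        String raw = requestParams.get("clientId");
        if (raw == null) {
            throw new IllegalArgumentException("missing request parameter: clientId");
        }
        doc.put("client_uid", Long.parseLong(raw));
        super.processAdd(doc);
    }
}

/** Models the final indexing step, which rejects documents lacking a required field. */
final class RequiredFieldCheckProcessor extends SketchProcessor {
    RequiredFieldCheckProcessor() {
        super(null);
    }

    @Override
    void processAdd(Map<String, Object> doc) {
        if (!doc.containsKey("client_uid")) {
            throw new IllegalStateException("missing required field: client_uid");
        }
    }
}
```

The point of the ordering is that the field is added before the document reaches the step that enforces required fields, which is why the custom processor has to come first in the chain.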
Snippet from solrconfig.xml (the class attribute must be the fully qualified name of the factory class):
<updateRequestProcessorChain name="mychain">
  <processor class="dreamagility.solr.MyNewProcessorFactory"/>
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
Code for the Solr plugin (the .jar file goes into the lib folder next to solr.xml; you need to create the lib folder yourself the first time):
package dreamagility.solr;
import java.io.IOException;
import org.apache.solr.common.SolrInputDocument;
import org.apache.solr.common.params.SolrParams;
import org.apache.solr.request.SolrQueryRequest;
import org.apache.solr.response.SolrQueryResponse;
import org.apache.solr.update.AddUpdateCommand;
import org.apache.solr.update.processor.UpdateRequestProcessor;
import org.apache.solr.update.processor.UpdateRequestProcessorFactory;
/**
* Created by Daniel on 06/01/2016.
*
* Adds extra tags to each document to be able to filter based on the client id it belongs to
* This is something that is not included as a part of the feed which is indexed but it will be supplied with
* the URL as a parameter.
*/
public class MyNewProcessorFactory extends UpdateRequestProcessorFactory {
    @Override
    public UpdateRequestProcessor getInstance(SolrQueryRequest solrQueryRequest, SolrQueryResponse solrQueryResponse, UpdateRequestProcessor next) {
        // Called by Solr for each update request to build this link of the chain.
        return new MyNewProcessor(solrQueryRequest, next);
    }
}

/** Copies the clientId request parameter into the client_uid field of every added document. */
class MyNewProcessor extends UpdateRequestProcessor {
    private final SolrQueryRequest solrQueryRequest;

    MyNewProcessor(SolrQueryRequest solrQueryRequest, UpdateRequestProcessor next) {
        super(next);
        this.solrQueryRequest = solrQueryRequest;
    }

    @Override
    public void processAdd(AddUpdateCommand cmd) throws IOException {
        SolrInputDocument document = cmd.getSolrInputDocument();
        SolrParams params = solrQueryRequest.getParams();
        // getInt returns null when clientId is absent from the URL, so check before using it.
        // Without the parameter the document is passed on unchanged and Solr reports the
        // missing required field as before.
        Integer clientId = params.getInt("clientId");
        if (clientId != null) {
            document.addField("client_uid", clientId);
        }
        // Hand the document to the next processor in the chain.
        super.processAdd(cmd);
    }
}
And my HTTP call looks like this:
curl "http://localhost:8983/solr/1-3/update?commit=true&commitWithin=10000&tr=updateXml.xsl&overwrite=true&clientId=3&update.chain=mychain" -H "Content-Type: text/xml" --data-binary @myfile.xml
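Since the query string carries several parameters, including the custom clientId and update.chain, it can help to assemble it programmatically so that every pair is separated by `&`. The following is a minimal sketch, assuming a hypothetical helper class `UpdateUrlSketch`; it only builds the URL string and does not send the request.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper that joins query parameters onto a base URL. */
final class UpdateUrlSketch {
    static String buildUrl(String base, Map<String, String> params) {
        StringBuilder url = new StringBuilder(base);
        char separator = '?'; // first parameter follows '?', the rest follow '&'
        for (Map.Entry<String, String> entry : params.entrySet()) {
            url.append(separator)
               .append(encode(entry.getKey()))
               .append('=')
               .append(encode(entry.getValue()));
            separator = '&';
        }
        return url.toString();
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the parameters in insertion order.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("commit", "true");
        params.put("tr", "updateXml.xsl");
        params.put("clientId", "3");
        params.put("update.chain", "mychain");
        System.out.println(buildUrl("http://localhost:8983/solr/1-3/update", params));
    }
}
```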