I am running into an issue where a query to our Solr search returns different values, even though I am querying on the id, which is set as the unique key field.
In the Solr Admin UI I run a query on that id.
The relevant part of the response is below.
"response": {
"numFound": 1,
"start": 0,
"maxScore": 7.4537606,
"docs": [
{
"title": [
"ICARDA forced to move"
],
"moduleid_s": "58",
"id": "client1.com.58.1673",
"enddate_dt": "2015-09-25T23:59:00Z",
"url": "mysite.com/item.aspx?id=1673",
"startdate_dt": "2015-09-25T00:00:00Z",
Running that same query a few more times eventually returns a different response.
"response": {
"numFound": 1,
"start": 0,
"maxScore": 7.453251,
"docs": [
{
"title": [
"ICARDA forced to move"
],
"moduleid_s": "58",
"id": "client1.com.58.1673",
"enddate_dt": "2015-09-25T23:59:00Z",
"url": "mysiteNewUrl.com/item.aspx?id=1673",
"startdate_dt": "2015-09-25T00:00:00Z",
Notice that the url is different.
With Debug Query checked, you can see that the different urls show up in the GET_FIELDS section.
Why/how can I get different information? I'm querying on the id, which is marked as the unique field. From my understanding there should never be more than one document with a given id. Could this be a synchronization issue? I'm using the Solr Admin UI query with a single core selected.
Is there a way to check whether only one document with that id is in the index?
UPDATE:
I ran a facet query on the id field, and that supposedly unique id comes back with a count of 2:
<lst name="facet_fields">
<lst name="id">
<int name="client1.com.58.1673">2</int>
versus an id that isn't having the issue:
<lst name="facet_fields">
<lst name="id">
<int name="client1.com.58.163">1</int>
Is this right? Does this explain my issue, in that there are duplicate documents? If that's the case, why is the query returning one document with varying data instead of two documents?
Is this a SolrCloud setup or a single-collection one? If it is SolrCloud, you most likely ended up with the same record in two different cores, possibly due to a router or an upgrade bug.
The good news is that you should be able to find all the records that have this problem by faceting with facet.field=id and facet.mincount=2. Then you can delete and reinsert them for consistency.
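A minimal sketch of that check in Python, in case it helps. It only builds the request and parses a response; it does not contact Solr. The base URL, the core name `mycore`, and the helper names are hypothetical, and the sample response is hand-written from the ids in the question. It assumes Solr's default JSON facet layout, a flat list of alternating values and counts.

```python
import json
from urllib.parse import urlencode


def duplicate_facet_params(field="id"):
    """Query parameters asking Solr for every value of `field` seen 2+ times."""
    return {
        "q": "*:*",
        "rows": 0,             # only the facet counts are needed
        "facet": "true",
        "facet.field": field,
        "facet.mincount": 2,   # hide values that occur only once
        "facet.limit": -1,     # no cap on the number of facet values
        "wt": "json",
    }


def duplicated_ids(response, field="id"):
    """Return {value: count} for values that occur more than once.

    Assumes the default flat layout: [value, count, value, count, ...].
    """
    flat = response["facet_counts"]["facet_fields"][field]
    pairs = zip(flat[0::2], flat[1::2])
    return {value: count for value, count in pairs if count >= 2}


def delete_by_id_query(doc_id):
    """JSON body for the update handler that deletes every copy of doc_id."""
    escaped = doc_id.replace("\\", "\\\\").replace('"', '\\"')
    return json.dumps({"delete": {"query": 'id:"%s"' % escaped}})


# Hypothetical base URL; adjust host, port, and core name for your setup.
SELECT_URL = "http://localhost:8983/solr/mycore/select"
query_url = SELECT_URL + "?" + urlencode(duplicate_facet_params())

# Hand-written sample shaped like Solr's JSON output. The second id has a
# count of 1, so the filter in duplicated_ids drops it.
sample = {
    "facet_counts": {
        "facet_fields": {
            "id": ["client1.com.58.1673", 2, "client1.com.58.163", 1]
        }
    }
}
duplicates = duplicated_ids(sample)
delete_bodies = [delete_by_id_query(doc_id) for doc_id in duplicates]
```

The delete uses a delete-by-query on the id rather than a delete-by-id, on the assumption that a query reaches every core holding a copy. After posting each body to the update handler and committing, reindex the affected documents from the source data.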
And no, you should not be able to end up in this state, so there is either a misconfiguration, an upgrade failure, or some forced command that ignored the uniqueness requirement.