
Fields in Apache Solr response are multivalued when they should be single-valued


I'm experiencing a problem with Apache Solr: fields come back wrapped in lists in JSON responses when they should be single values. Here is an excerpt from schema.xml; two example fields giving me a problem are django_ct and django_id:

  <fields>
    <!-- general -->
    <field name="id" type="string" indexed="true" stored="true" multiValued="false" required="true"/>
    <field name="django_ct" type="string" indexed="true" stored="true" multiValued="false"/>
    <field name="django_id" type="string" indexed="true" stored="true" multiValued="false"/>

Here is an example of how data is posted to Solr:

<doc>
    <field name="id">search.productcategory.3</field>
    <field name="gender">M</field>
    <field name="name">OBQYHSOQLWOUEHRMPSDI</field>
    <field name="text">M\nOBQYHSOQLWOUEHRMPSDI</field>
    <field name="django_id">3</field>
    <field name="django_ct">search.productcategory</field>
</doc>

And here is an example of a document as returned by Solr:

  "response": {
    "numFound": 1,
    "start": 0,
    "docs": [
      {
        "django_ct": [
          "search.productcategory"
        ],
        "name": [
          "Example"
        ],
        "text": [
          "Male\nExample"
        ],
        "id": "search.productcategory.2",
        "gender": [
          "Male"
        ],
        "django_id": [
          2
        ],
        "_version_": 1502081283634757600
      }
    ]
  }

What is causing these fields to be wrapped in lists? In the schema, the multiValued attribute for these fields is set to false. Apart from creating the core and replacing schema.xml, everything else is straight out of the box. I'm accessing Solr through Haystack (a Django plugin); its code expects single values for these fields and breaks completely when it receives lists. Tracing the problem back, it seems to be caused by how Solr is configured.

Edit: Here are the complete contents of solr.log. All of this was logged after starting the server; running a couple of example queries produced no further output:

INFO  - 2015-05-27 08:38:12.563; [   ] org.eclipse.jetty.server.Server; jetty-8.1.10.v20130312
INFO  - 2015-05-27 08:38:12.586; [   ] org.eclipse.jetty.deploy.providers.ScanningAppProvider; Deployment monitor /Users/sampeka/solr-5.1.0/server/contexts at interval 0
INFO  - 2015-05-27 08:38:12.593; [   ] org.eclipse.jetty.deploy.DeploymentManager; Deployable added: /Users/sampeka/solr-5.1.0/server/contexts/solr-jetty-context.xml
INFO  - 2015-05-27 08:38:13.629; [   ] org.eclipse.jetty.webapp.StandardDescriptorProcessor; NO JSP Support for /solr, did not find org.apache.jasper.servlet.JspServlet
INFO  - 2015-05-27 08:38:13.682; [   ] org.apache.solr.servlet.SolrDispatchFilter; SolrDispatchFilter.init()WebAppClassLoader=1121453612@42d8062c

Solution

  • Got to the root of the problem: solrconfig.xml wasn't configured correctly. By default the schemaFactory class is set to ManagedIndexSchemaFactory, which overrides the use of schema.xml, so the multiValued="false" settings in that file were never applied. Changing the schemaFactory class to ClassicIndexSchemaFactory forces Solr to use schema.xml and makes the schema immutable by API calls.
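
For reference, the change described above might look like the following in the core's solrconfig.xml. This is a minimal sketch, assuming the file still contains the stock ManagedIndexSchemaFactory block; the exact contents of that block vary between Solr versions and configsets, so treat the comments as guidance rather than a literal diff.

```xml
<!-- solrconfig.xml, inside the top-level <config> element.
     Assumption: an existing <schemaFactory class="ManagedIndexSchemaFactory"> element
     is present; remove it so that only one schemaFactory is declared. -->

<!-- Read the schema from the hand-edited schema.xml and disallow changes via the API. -->
<schemaFactory class="ClassicIndexSchemaFactory"/>
```

After the change, the core has to be reloaded (or Solr restarted) for it to take effect. Documents indexed while the managed schema was active will most likely need to be re-indexed before they come back single-valued. If the configuration also enables an update processor chain that adds unknown fields to the schema, that chain may need to be removed or disabled too, since it relies on a mutable schema.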