How to create a Solr schema for hierarchical faceting by splitting data into multiple fields at index time


I want to implement Solr hierarchical faceting for my application, which has a two-level hierarchy of Category and SubCategory. I want to use the pivot facets approach described at http://wiki.apache.org/solr/HierarchicalFaceting#Pivot_Facets.

The flattened data will be as below:

Doc#1: NonFic > Law
Doc#2: NonFic > Sci
Doc#3: NonFic > Sci > Phys

This data should be split into a separate field for each level of the hierarchy at index time, as shown below.

Indexed Terms

Doc#1: category_level0: NonFic; category_level1: Law
Doc#2: category_level0: NonFic; category_level1: Sci
Doc#3: category_level0: NonFic; category_level1: Sci; category_level2: Phys

Can anyone suggest how to implement this? How do I define the Solr schema to achieve it? I could not find any reference for splitting data at index time as described above.

Thanks,

Priyanka


Solution

  • Do you need to display those individual fields as part of the returned documents? If so, you need the split values in the 'stored' version of the field. If you only need them for searching or faceting, you can ignore the 'stored' form and concentrate on the 'indexed' form.

    In either case, if you need to split one field into several, you can do that with copyField or with an UpdateRequestProcessor.

    With copyField, the 'stored' form will be the same for all fields, but each field can have its own analyzer, with each one picking a different part of the hierarchy for the 'indexed' form.

    With an UpdateRequestProcessor, you can write a custom one that takes one field and emits several fields, each containing only its part of the path. Alternatively, you can chain existing processors: copy the field a few times and then apply a different regex processor to each copy.
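
To make the custom-processor option more concrete, here is a minimal sketch of only the splitting logic such a processor might apply. It is plain Java with no Solr dependencies. The class name `CategoryLevelSplitter`, the method `splitLevels`, and the `category_level` prefix are assumptions for illustration (the prefix follows the field names in the question), and it assumes levels are separated by `>` as in the sample data.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Sketch of the splitting step a custom UpdateRequestProcessor could perform.
 * All names here are hypothetical; no Solr classes are used.
 */
final class CategoryLevelSplitter {

    /**
     * Splits a flattened path such as "NonFic > Sci > Phys" into one entry
     * per level, keyed as fieldPrefix + level index (0-based).
     */
    static Map<String, String> splitLevels(String path, String fieldPrefix) {
        // LinkedHashMap keeps the levels in hierarchy order.
        Map<String, String> fields = new LinkedHashMap<>();
        if (path == null || path.trim().isEmpty()) {
            return fields; // nothing to split for an empty path
        }
        // Split on '>' and discard any whitespace around it.
        String[] parts = path.trim().split("\\s*>\\s*");
        for (int level = 0; level < parts.length; level++) {
            fields.put(fieldPrefix + level, parts[level]);
        }
        return fields;
    }

    public static void main(String[] args) {
        // The three sample documents from the question.
        String[] docs = {"NonFic > Law", "NonFic > Sci", "NonFic > Sci > Phys"};
        for (String doc : docs) {
            System.out.println(splitLevels(doc, "category_level"));
        }
    }
}
```

In an actual processor, this logic would presumably be called from `processAdd(AddUpdateCommand cmd)`, reading the flattened value from `cmd.getSolrInputDocument()` and writing each resulting entry back with `setField`. That wiring is omitted here because it depends on your Solr version and chain configuration.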