indexing solr lucene relationship denormalization

How to index typed relationships in Solr/Lucene


There are two entities which can relate to each other in a typed way. How should I denormalise and index these tables so that I can search Solr/Lucene for entity A by a specific entity B and relationship type?

For example, let's say the entities organisation and person are linked by the following table:

+------------+-------------+--------------+
| link_type  | person      | organisation |
+------------+-------------+--------------+
| Founder    | Elon Musk   | SpaceX       |
+------------+-------------+--------------+
| Chairman   | Elon Musk   | SolarCity    |
+------------+-------------+--------------+
| Founder    | Lyndon Rive | SolarCity    |
+------------+-------------+--------------+
| Founder    | Elon Musk   | Tesla        |
+------------+-------------+--------------+

I'd like to be able to search for all organisations which Elon Musk has founded. The expected document result would be:

[SpaceX, Tesla]

Solution

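  • There are two possible solutions. The first is to index each row of the link table as its own document, then query for link_type:Founder AND person:Elon\ Musk. Each matching document is one link row, so the organizations are read from its organisation field.

    The second option is to index one document per organization and give it a dynamic, multivalued field for each link type, with the link_type in the field name and the related persons as its values. The query then becomes: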

    link_type_Founder:Elon\ Musk
    

    The returned documents are the organizations where the person has the given link_type.
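
As a rough illustration of the two document layouts described above, here is a minimal Python sketch. It talks to no Solr instance: it only builds the documents and query strings, and uses exact-match list filtering as a stand-in for the search. The field names come from the table and the answer; the helper names, the `id` values, and the assumption that the fields are indexed as untokenized strings (so that Elon\ Musk matches as a single term) are illustrative and not part of the original answer.

```python
# Rows of the link table from the question: (link_type, person, organisation).
LINKS = [
    ("Founder", "Elon Musk", "SpaceX"),
    ("Chairman", "Elon Musk", "SolarCity"),
    ("Founder", "Lyndon Rive", "SolarCity"),
    ("Founder", "Elon Musk", "Tesla"),
]


def row_documents(links):
    """Option 1: one document per link row (ids are hypothetical)."""
    return [
        {"id": f"link-{i}", "link_type": lt, "person": p, "organisation": o}
        for i, (lt, p, o) in enumerate(links)
    ]


def organisation_documents(links):
    """Option 2: one document per organisation, one multivalued field per link type."""
    docs = {}
    for lt, p, o in links:
        doc = docs.setdefault(o, {"id": o})
        doc.setdefault(f"link_type_{lt}", []).append(p)
    return list(docs.values())


def escape_term(value):
    """Backslash-escape spaces so the value is queried as a single term."""
    return value.replace(" ", "\\ ")


def row_query(link_type, person):
    """Query string for option 1."""
    return f"link_type:{escape_term(link_type)} AND person:{escape_term(person)}"


def organisation_query(link_type, person):
    """Query string for option 2."""
    return f"link_type_{link_type}:{escape_term(person)}"


# In-memory stand-ins for the two searches (exact match only, no text analysis).
def search_rows(docs, link_type, person):
    return [
        d["organisation"]
        for d in docs
        if d["link_type"] == link_type and d["person"] == person
    ]


def search_organisations(docs, link_type, person):
    return [d["id"] for d in docs if person in d.get(f"link_type_{link_type}", [])]


if __name__ == "__main__":
    print(row_query("Founder", "Elon Musk"))
    print(search_rows(row_documents(LINKS), "Founder", "Elon Musk"))
    print(organisation_query("Founder", "Elon Musk"))
    print(search_organisations(organisation_documents(LINKS), "Founder", "Elon Musk"))
```

With the first layout the hits are link rows and the organisation is a stored field on each hit; with the second the hits are the organisation documents themselves, at the cost of one field per link type. In a real schema the second layout would presumably rely on a dynamic field pattern along the lines of link_type_*.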