Solr DataImportHandler - JOIN vs. separate entity


In a Solr DIH data-config.xml, is it better to fetch as many fields as possible in the main entity's query using a JOIN, like this:

<entity name="Lists" 
        pk="l.list_id" 
        query="SELECT l.list_id AS id, l.user_id, lo.is_votable FROM lists l
                 INNER JOIN list_options lo ON lo.list_id = l.list_id" />

or use a separate sub-entity like:

<entity name="Lists" 
        pk="l.list_id" 
        query="SELECT l.list_id AS id, l.user_id FROM lists l">

  <entity name="ListOptions" 
          query="SELECT lo.is_votable FROM list_options lo 
                   WHERE lo.list_id=${Lists.id}" />

</entity>

Solution

  • A few pointers that may help you decide:

    • A sub-entity fires a separate query for every row returned by its parent entity, so it will be slower if you have a huge collection.
    • If you have a one-to-one mapping, you can use the JOIN so that a single query returns all the fields.
    • If you have multiple related rows for each root record, use a sub-entity, which will typically populate a multivalued field. (You can't use a single JOIN query here, because it would return multiple rows for the same document, unless that is the behavior you want.)
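
To make the last two points concrete, here is a rough sketch that combines both approaches: the one-to-one list_options table is JOINed in the root query, and a one-to-many table gets its own sub-entity. The list_items table, its title column, and the item_title field name are hypothetical and are not part of the question's schema; pk is left out for brevity.

```xml
<!-- Sketch only: list_items, title and item_title are assumed names, not from the question -->
<entity name="Lists"
        query="SELECT l.list_id AS id, l.user_id, lo.is_votable
                 FROM lists l
                 INNER JOIN list_options lo ON lo.list_id = l.list_id">

  <!-- One-to-many: this query runs once per Lists row;
       each row it returns adds one value to item_title -->
  <entity name="ListItems"
          query="SELECT li.title FROM list_items li
                   WHERE li.list_id=${Lists.id}">
    <field column="title" name="item_title" />
  </entity>

</entity>
```

For this to work, item_title would have to be declared with multiValued="true" in the Solr schema. The per-row query cost from the first point still applies to the ListItems sub-entity.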