
Solr loading information without data import handler


I have 700,000 street names, 8,111 municipality names, and 80,333 locality postcodes. I would like to index all of this information in Solr. Users will search it through an AJAX autocomplete form. I have tested this with a small amount of data, and the autocomplete form behaves correctly. This is the field type I am using:

 <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory"
            ignoreCase="true"
            words="stopwords.txt"
            enablePositionIncrements="true"
            />
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory"
            ignoreCase="true"
            words="stopwords.txt"
            enablePositionIncrements="true"
            />
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
  </analyzer>
</fieldType>

The problem arises when loading all of the data into Solr:

  • How should I load the information into the Solr server? I'm working in a Grails app and need to load instances that hold the information, without using the Data Import Handler. I spent many hours on this today, and in the end the Grails console crashed :( Should I use a Grails script instead of writing a service and running it from the Grails console?
  • Or should I use the Data Import Handler to load it faster? Can I concatenate string values from different columns of different tables with the Data Import Handler?

Is it okay to have a separate document for each one (700,000 + 8,111 + 80,333 documents)?

Thanks for your time.


Solution

  • I assume your municipalities, street names, and postcodes are supposed to be autocompleted separately. In that case you'd use a separate Solr core for each one.

    Or should I use the Data Import Handler to load it faster?

    DIH will be pretty fast, and as long as this information doesn't change very often, it should be fine to do it this way.

    Can I concatenate string values from different columns of different tables with the Data Import Handler?

    Yes; in data-config.xml you write the SQL query yourself, so you can use the database's native concatenation operator (e.g. || in Oracle).
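
As a rough illustration of the concatenation point, a data-config.xml along these lines might work. This is only a sketch: the table names (street, municipality), the column names, the Solr field names (id, label), and the connection details are all hypothetical, and Oracle is assumed only because its || operator is the example given above. Adjust the driver, URL, and SQL for your own database.

```xml
<dataConfig>
  <!-- Assumed connection details; replace with your own driver, URL, and credentials -->
  <dataSource type="JdbcDataSource"
              driver="oracle.jdbc.OracleDriver"
              url="jdbc:oracle:thin:@//localhost:1521/XE"
              user="solr_user"
              password="secret"/>
  <document>
    <!-- Hypothetical tables: each street row is joined to its municipality,
         and the two names are concatenated in SQL with Oracle's || operator -->
    <entity name="street"
            query="SELECT s.id AS ID,
                          s.name || ', ' || m.name AS LABEL
                   FROM street s
                   JOIN municipality m ON m.id = s.municipality_id">
      <!-- Map result-set columns to Solr fields (field names are assumptions
           and must exist in your schema.xml) -->
      <field column="ID" name="id"/>
      <field column="LABEL" name="label"/>
    </entity>
  </document>
</dataConfig>
```

Because the concatenation happens in the database, Solr receives a single ready-made string per row. If you follow the one-core-per-type suggestion, each core would presumably carry its own data-config.xml with a query for its own table.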