I have 700,000 street names, 8,111 municipality names, and 80,333 locality postcodes. I would like to index all of this information in Solr so that users can search it through an AJAX autocomplete form. I have tested it with a small amount of data, and the autocomplete form behaves correctly.
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory"
            ignoreCase="true"
            words="stopwords.txt"
            enablePositionIncrements="true"
    />
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory"
            ignoreCase="true"
            words="stopwords.txt"
            enablePositionIncrements="true"
    />
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
  </analyzer>
</fieldType>
The problem appears when I load all the data into Solr.
Is it okay to have a separate document for each entry (700,000 + 8,111 + 80,333 documents)?
Thanks for your time.
I assume your municipalities, street names, and postcodes are supposed to be autocompleted separately. In that case you'd use a separate Solr core for each one.
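As a rough sketch of what separate cores could look like, assuming an older Solr release that still uses the legacy multicore solr.xml format (the schema above suggests that era). The core names and instance directories here are hypothetical, not something from your setup:

```xml
<!-- solr.xml (legacy multicore format); core names and directories are hypothetical -->
<solr persistent="true">
  <cores adminPath="/admin/cores">
    <!-- one core per autocomplete source, each with its own conf/schema.xml -->
    <core name="streets" instanceDir="streets"/>
    <core name="municipalities" instanceDir="municipalities"/>
    <core name="postcodes" instanceDir="postcodes"/>
  </cores>
</solr>
```

Each autocomplete field would then query its own core, e.g. /solr/streets/select for street names.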
Or should I use the Data Import Handler to load it faster?
DIH (the Data Import Handler) will be pretty fast, and as long as this information doesn't change very often, it should be fine to do it this way.
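For reference, DIH is enabled by registering a request handler in solrconfig.xml. A minimal sketch, assuming the handler is mounted at /dataimport and the import definition lives in data-config.xml next to it (the DIH jar must also be on the core's classpath):

```xml
<!-- solrconfig.xml: register the DataImportHandler; the /dataimport path is a conventional choice -->
<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <!-- file that holds the data source and entity definitions -->
    <str name="config">data-config.xml</str>
  </lst>
</requestHandler>
```

A full load is then started by requesting /dataimport?command=full-import on that core.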
Can I concatenate string values from different columns of different tables with the Data Import Handler?
Yes; in data-config.xml you write the SQL query yourself, so you can use the database's native concatenation (e.g. || in Oracle).
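Something along these lines, as a sketch only: the connection details, table names, column names, and Solr field names below are all made up for illustration, and the || operator assumes Oracle.

```xml
<!-- data-config.xml: hypothetical schema with STREETS and MUNICIPALITIES tables -->
<dataConfig>
  <dataSource type="JdbcDataSource"
              driver="oracle.jdbc.OracleDriver"
              url="jdbc:oracle:thin:@//localhost:1521/XE"
              user="solr"
              password="secret"/>
  <document>
    <!-- the join and the concatenation happen in the database, not in Solr -->
    <entity name="street"
            query="SELECT s.id AS id,
                          s.name || ', ' || m.name AS label
                   FROM streets s
                   JOIN municipalities m ON m.id = s.municipality_id">
      <!-- map result columns to Solr fields (Oracle reports unquoted aliases in upper case) -->
      <field column="ID" name="id"/>
      <field column="LABEL" name="label"/>
    </entity>
  </document>
</dataConfig>
```

The label field would need to exist in schema.xml; on another database you would swap in its own concatenation syntax (for example CONCAT(...) in MySQL).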