I've got a Solr (version 4.10.3) cloud consisting of 3 Solr instances managed by Zookeeper. Each core is replicated from the current leader to the other 2 for redundancy.
Now to the problem. I need to index a datetime field from SQL as a TextField for wildcard queries (not the best solution, but a requirement nonetheless). On the core that does the import, everything looks as it should, and the field contains values like: 2008.10.18 17:16:31.0
but the corresponding document (synced by the replication handler) on the other cores has values like: Sat Oct 18 17:16:31 CEST 2008
for the same field. I've been trying for a while to get to the bottom of this without success. Aside from this, both the core and the cloud behave as intended.
Does anyone have an idea of what I'm doing wrong?
The fieldType looks like this:
<fieldType name="stringD" class="solr.TextField" sortMissingLast="true" omitNorms="false">
<analyzer>
<tokenizer class="solr.KeywordTokenizerFactory"/>
<filter class="solr.PatternReplaceFilterFactory" pattern="([-])" replacement="." replace="all" />
</analyzer>
</fieldType>
Here is a link to a screenshot showing the behavior in all its glory. The top part is from the core that did the full-import.
So my first answer goes to my first question here ;)
When this core was initially set up, an import query like this was used:
SELECT * FROM [TABLE]
and then the fields were mapped like this in the data-import-handler:
<field column="ENDTIME" name="ENDTIME" />
When Solr started to convert the content of the [ENDTIME] (datetime2) column in SQL to a date, this was added to the import query:
CAST(CAST(ENDTIME as datetime2(0)) as varchar(100)) as ENDTIMESTR
to force the correct format from SQL: 2008-10-18 17:16:31.0
The data-import-handler mapping was also changed to the following:
<field column="ENDTIMESTR" name="ENDTIME" />
Because of this, both [ENDTIME] and [ENDTIMESTR] came from SQL into the data-import-handler (remember the SELECT * FROM [TABLE]), and somehow Solr was only able to use the correct field/fieldType on the core that initiated the full-import. When replicating the field to the other cores, Solr seems to have looked at the original [ENDTIME] column, which only exists in the data-import-handler during a full/delta-import. ENDTIME in the Solr schema was a TextField all along.
SOLUTION: Remove the * and instead explicitly list all fields in the full/delta queries, with [ENDTIME] written like this: CAST(CAST(ENDTIME as datetime2(0)) as varchar(100)) as ENDTIME
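For anyone who wants to see it in context, here is a rough sketch of what the entity in data-config.xml could look like after the change. It is not my exact configuration: the dataSource attributes (HOST, DB, USER, PASSWORD), the entity name "item" and the ID column are placeholders I made up for illustration. Only the CAST expression and the ENDTIME mapping are taken from the setup described above.

```xml
<dataConfig>
  <!-- Placeholder connection details; replace with your own. -->
  <dataSource type="JdbcDataSource"
              driver="com.microsoft.sqlserver.jdbc.SQLServerDriver"
              url="jdbc:sqlserver://HOST;databaseName=DB"
              user="USER"
              password="PASSWORD" />
  <document>
    <!-- No SELECT *: every column is listed explicitly, so the raw
         datetime2 ENDTIME column never reaches the data-import-handler.
         ID is a hypothetical key column used only for this example. -->
    <entity name="item"
            query="SELECT ID,
                          CAST(CAST(ENDTIME as datetime2(0)) as varchar(100)) as ENDTIME
                   FROM [TABLE]">
      <field column="ID" name="id" />
      <!-- The alias now matches the Solr field name directly. -->
      <field column="ENDTIME" name="ENDTIME" />
    </entity>
  </document>
</dataConfig>
```

The point of aliasing the CAST back to ENDTIME is that only one column with that name ever comes out of SQL, so there is nothing ambiguous left for the import or the replication to pick up. The same explicit column list would go into the delta queries as well.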
Everything now behaves as intended. I guess there's a bug in the data-import-handler mapping somewhere, but my configuration wasn't really the best either.
Hope this can help someone else out on a slippery-Solr-slope!