I have indexed records that contain a field called birth_date. It is not a stored field and not a date field; it is a text field (solr.TextField) using solr.StandardTokenizerFactory. In Solr 5, when I ran the search query
q=*:*&fq=birth_date:1989/01/01
I got about 33 filtered records, but when I run the same query in Solr 8 (with the same field definition), I get more than 6000 results.
Below is the schema definition of the field:
<fieldtype name='birth_date' class='solr.TextField' sortMissingLast='true' omitNorms='true'>
<analyzer>
<tokenizer class='solr.StandardTokenizerFactory'/>
</analyzer>
</fieldtype>
<field name='birth_date' type='birth_date' indexed='true' stored='false' multiValued='false' required='false'/>
Between Solr 5 and Solr 8 I don't see any change in solr.StandardTokenizerFactory, but I did notice that the default "similarity" has changed. I would like to know why the search is not giving the same output.
I tried q=*:*&fq=birth_date:1989/01/01 and expected the same number of results in Solr 5 and Solr 8.
After debugging the query, I saw that in Solr 5 the filter was parsed as a phrase query:
"parsed_filter_queries": [
"PhraseQuery(birth_date:\"1989 01 01\")"
]
But in Solr 8 it was parsed as three separate term clauses:
"parsed_filter_queries":["birth_date:1989 birth_date:01 birth_date:01"]
Only after adding double quotes around the value in the fq (fq=birth_date:"1989/01/01") did it change back to a phrase query.
Another workaround was to use this field type:
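As a rough illustration of the quoting fix above, here is a minimal Python sketch that builds the request parameters with the date wrapped in double quotes. The helper names `build_fq` and `build_params` are made up for this example and are not part of Solr or any client library; only the `q` and `fq` parameters come from the query shown earlier.

```python
from urllib.parse import urlencode


def build_fq(date_value):
    # Wrap the value in double quotes so the query parser sees one phrase
    # instead of three independent terms (1989, 01, 01).
    return 'birth_date:"{}"'.format(date_value)


def build_params(date_value):
    # URL-encode the parameters; the quotes, colon and slashes get escaped.
    return urlencode({"q": "*:*", "fq": build_fq(date_value)})


if __name__ == "__main__":
    print(build_params("1989/01/01"))
```

The resulting string can be appended to the select handler URL of the collection being queried.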
<fieldtype name="birth_date" class="solr.TextField" sortMissingLast="true" omitNorms="true">
<analyzer type="index">
<tokenizer class="solr.KeywordTokenizerFactory" />
<filter class="solr.PatternReplaceFilterFactory" pattern="([^0-9])" replacement="" replace="all" />
</analyzer>
<analyzer type="query">
<tokenizer class="solr.KeywordTokenizerFactory" />
<filter class="solr.PatternReplaceFilterFactory" pattern="([^0-9])" replacement="" replace="all" />
</analyzer>
</fieldtype>
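Here both the index and query analyzers keep the whole value as one token and strip every character that is not a digit, so 1989/01/01 is indexed and searched as the single token 19890101. Because the index analyzer changes, the data has to be reindexed.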
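To make the effect of that analyzer chain concrete, the following Python sketch imitates it outside Solr. It is only an approximation for illustration: `normalize` is a hypothetical helper, and it assumes the keyword tokenizer passes the entire input through as one token before the pattern replacement runs.

```python
import re


def normalize(value):
    # KeywordTokenizerFactory emits the whole input as a single token;
    # the pattern [^0-9] with replace="all" then removes every non-digit.
    return re.sub(r"[^0-9]", "", value)


if __name__ == "__main__":
    for raw in ("1989/01/01", "1989-01-01", "1989 01 01"):
        print(raw, "->", normalize(raw))
```

Since the same normalization is applied at index and query time, differently punctuated spellings of the same date end up matching the same token.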