I am getting more results with an exact search than with a fuzzy search. This happens when the search input contains a hyphen.
My index has a searchable field called productDescription that uses the Lucene Standard Analyzer. The index field's details are:
{
"name": "productDescription",
"type": "Edm.String",
"searchable": true,
"filterable": true,
"retrievable": true,
"stored": true,
"sortable": false,
"facetable": false,
"key": false,
"indexAnalyzer": null,
"searchAnalyzer": null,
"analyzer": "standard.lucene",
"normalizer": null,
"dimensions": null,
"vectorSearchProfile": null,
"vectorEncoding": null,
"synonymMaps": []
}
I am testing my queries in the index's Search Explorer in the Azure portal before I try anything in Java.
Here is a query that works as expected:
{
"search": "agent resistant",
"queryType": "full",
"count": true
}
This returns 50 results. With the fuzzy operator added to each term:
{
"search": "agent~ resistant~",
"queryType": "full",
"count": true
}
This returns 100 results, which is expected after adding fuzzy search.
The query that I am having trouble with is:
{
"search": "ultra-wise",
"queryType": "full",
"count": true
}
This returns 65 results.
When I change it to add fuzzy search:
{
"search": "ultra-wise~",
"queryType": "full",
"count": true
}
I now get 0 results. I have also tried ultra~ wise~ and ultra wise~, which do return results, but my users will be entering the hyphen.
Is there something else I should account for when using hyphens?
When you use a hyphen (-) without fuzzy search, the standard analyzer splits the input into separate words and searches for each one.
When you search for ultra-wise, it searches for both ultra and wise, which is why you are getting 65 results.
You can see this by adding highlight to the search query.
Below is a sample query and its results.
Query:
{
"search": "ultra-wise",
"queryType": "full",
"count": true,
"highlight": "productDescription"
}
Results:
{
"@odata.context": "https://axsxnsk.search.windows.net/indexes('azureblob-indexjsons')/$metadata#docs(*)",
"@odata.count": 3,
"value": [
{
"@search.score": 0.6986222,
"@search.highlights": {
"productDescription": [
"<em>Ultra</em> sharp kitchen knives with ergonomic handles."
]
},
"productDescription": "Ultra sharp kitchen knives with ergonomic handles.",
"AzureSearch_DocumentKey": "aHR0canNvbjsxOQ2",
"metadata_storage_name": "output.json"
},
{
"@search.score": 0.62191015,
"@search.highlights": {
"productDescription": [
"Noise cancelling over ear headphones with <em>ultra</em> clear sound."
]
},
"productDescription": "Noise cancelling over ear headphones with ultra clear sound.",
"AzureSearch_DocumentKey": "aHR0cHM6Lyjs10",
"metadata_storage_name": "output.json"
},
{
"@search.score": 0.593085,
"@search.highlights": {
"productDescription": [
"<em>Ultra</em> fast USB C charging cable for modern devices."
]
},
"productDescription": "Ultra fast USB C charging cable for modern devices.",
"AzureSearch_DocumentKey": "aHR0cHM6Ly9jsxMA2",
"metadata_storage_name": "output.json"
}
]
}
Notice that the count is 3: I got results containing the word ultra and none for wise, since wise is not present in my data.
With the fuzzy operator, however, the input is not split. The search runs against the whole term, so it only matches indexed terms that are close to the combined word (such as ultrawise).
See below.
{
"search": "ultra-wise~",
"queryType": "full",
"count": true,
"highlight": "productDescription"
}
Now the count is 2, and both hits are fuzzy matches:
{
"@odata.context": "https://dkdnd.search.windows.net/indexes('azureblob-indexjsons')/$metadata#docs(*)",
"@odata.count": 2,
"value": [
{
"@search.score": 0.64547044,
"@search.highlights": {
"productDescription": [
"<em>Ultrawise</em> cleaning agent for all surfaces."
]
},
"productDescription": "Ultrawise cleaning agent for all surfaces.",
"AzureSearch_DocumentKey": "aHR0cHM6Ly92amdbjsx0",
"metadata_storage_name": "output.json"
},
{
"@search.score": 0.5647866,
"@search.highlights": {
"productDescription": [
"Premium high resolution <em>ultrawide</em> monitor for gaming."
]
},
"productDescription": "Premium high resolution ultrawide monitor for gaming.",
"AzureSearch_DocumentKey": "aHR0cHM6Ly92amdzNvbjs50",
"metadata_storage_name": "output.json"
}
]
}
Your documents contain no word like ultrawise, so you get zero results.
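To see why the hyphenated fuzzy term can match ultrawise and ultrawide but not ultra or wise, it may help to compute the edit distances directly. The sketch below rests on two assumptions that are not stated in this thread: that the fuzzy term is compared as a whole (lowercased but not split on the hyphen), and that the default maximum edit distance of 2 applies. The class name EditDistanceDemo and the method levenshtein are hypothetical names used only for illustration. The service documents its fuzzy matching in terms of Damerau-Levenshtein distance; plain Levenshtein distance is used here as a simplification and gives the same numbers for these particular words.

```java
// Illustrative only: a plain Levenshtein distance, not the service's implementation.
final class EditDistanceDemo {

    // Classic two-row dynamic programming edit distance
    // (insertions, deletions, substitutions each cost 1).
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        String queryTerm = "ultra-wise"; // assumed to stay one term under the fuzzy operator
        String[] indexedTerms = {"ultrawise", "ultrawide", "ultra", "wise"};
        for (String term : indexedTerms) {
            System.out.println(term + " -> " + levenshtein(queryTerm, term));
        }
        // Prints:
        // ultrawise -> 1
        // ultrawide -> 2
        // ultra -> 5
        // wise -> 6
    }
}
```

Under those assumptions, only terms at distance 2 or less qualify, which is consistent with the two hits shown above and with zero hits when the index holds only ultra and wise.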
I see the same behavior when I search for High-strength:
{
"search": "High-strength",
"queryType": "full",
"count": true,
"highlight": "productDescription"
}
I get results for each word:
{
"@odata.context": "https://qwqwoq.search.windows.net/indexes('azureblob-indexjsons')/$metadata#docs(*)",
"@odata.count": 2,
"value": [
{
"@search.score": 1.9095788,
"@search.highlights": {
"productDescription": [
"<em>High</em>-<em>strength</em> adhesive for industrial applications."
]
},
"productDescription": "High-strength adhesive for industrial applications.",
"AzureSearch_DocumentKey": "aHR0cHM6Ly92amdzYjsy0",
"metadata_storage_name": "output.json"
},
{
"@search.score": 0.7261542,
"@search.highlights": {
"productDescription": [
"Premium <em>high</em> resolution ultrawide monitor for gaming."
]
},
"productDescription": "Premium high resolution ultrawide monitor for gaming.",
"AzureSearch_DocumentKey": "aHR0cHM6Ly92ambjs50",
"metadata_storage_name": "output.json"
}
]
}
When I do a fuzzy search, I get zero results, since my data has no word like Highstrength.
{
"search": "High-strength~",
"queryType": "full",
"count": true,
"highlight": "productDescription"
}
Results:
{
"@odata.context": "https://dflckds.search.windows.net/indexes('azureblob-indexjsons')/$metadata#docs(*)",
"@odata.count": 0,
"value": []
}
If you want a fuzzy search on each word separately, query with ultra~ wise~. To match the exact phrase ultra-wise, search with \"ultra-wise\" (the quotes are escaped because they sit inside the JSON string).
Whatever the user enters, split it on the hyphen and apply fuzzy search to each part.
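Since the question mentions moving to Java next, the splitting suggested above could be sketched roughly as follows. This is a minimal sketch, not part of any SDK: FuzzyQueryBuilder, toFuzzyQuery and toPhraseQuery are hypothetical names, and the sketch assumes it is enough to split on whitespace and hyphens. Other characters that are special in the full Lucene query syntax are not escaped here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for building the "search" string of a queryType "full" request.
final class FuzzyQueryBuilder {

    // Splits the raw input on whitespace and hyphens and appends the
    // fuzzy operator (~) to every resulting term.
    // toFuzzyQuery("ultra-wise") returns "ultra~ wise~"
    static String toFuzzyQuery(String userInput) {
        List<String> terms = new ArrayList<>();
        for (String part : userInput.trim().split("[\\s-]+")) {
            if (!part.isEmpty()) { // skip the empty piece left by a leading hyphen
                terms.add(part + "~");
            }
        }
        return String.join(" ", terms);
    }

    // Wraps the raw input in double quotes for a phrase search,
    // escaping any double quotes the user typed.
    // toPhraseQuery("ultra-wise") returns the 12 characters "ultra-wise" including the quotes
    static String toPhraseQuery(String userInput) {
        return "\"" + userInput.trim().replace("\"", "\\\"") + "\"";
    }
}
```

The returned string would go into the search field of the request body, with queryType still set to full. Which of the two methods to call is a design choice: toFuzzyQuery is more forgiving of typos, while toPhraseQuery keeps the two words together and in order.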