I have an Elasticsearch index that stores some items. The structure is below.
public class items
{
    public string item_no { get; set; }
    public string category { get; set; }
    public int campaign { get; set; }
    public int in_stock { get; set; }
    // The following properties contain only [a-z0-9], no other characters.
    public string score_item_no { get; set; }
    public string score_group_one { get; set; }
    public string score_group_two { get; set; }
    public string score_description { get; set; }
    public string score_all_fields { get; set; } /* score_item_no + score_group_one + score_group_two + score_description and something else */
}
public class ClassForScore
{
    public int id { get; set; }
    public string item_no { get; set; }
}
I need to filter useless records out of the results. I decided to use the score for this and wrote a function that calculates the average score. So I first query Elasticsearch to get the scores, and then query it again with the MinScore parameter. I could not find any solution for filtering out useless results. Any advice? This is question one.
And the second one: the first score call returns 7 records, each with a different score. For example, the first record has a score of 1100, but I would like to know where this 1100 comes from. Is it 1000 from score_item_no and 100 from score_group_one? Or 500 from score_group_one matching 5 parts, 500 from score_group_two matching 5 parts, and 100 from score_description matching 2 parts? Is there a way to see the score details?
QueryContainer queryContainsAnd = new WildcardQuery() { Field = "score_all_fields", Value = "*" + mykeyword + "*" };
QueryContainer queryEqualsOr = new TermQuery() { Field = "category", Value = *something1* };
queryEqualsOr |= new TermQuery() { Field = "category", Value = *something2* };
QueryContainer queryEqualsAnd = new TermQuery() { Field = "campaign", Value = 1 };
queryEqualsAnd &= new TermQuery() { Field = "in_stock", Value = 1 };
QueryContainer mainQuery = queryContainsAnd & queryEqualsAnd & queryEqualsOr;
Func<QueryContainerDescriptor<ClassForScore>, QueryContainer> fo = funcScoreParam(new ClassForScore(), filterItemNo, filterGroupOne, filterGroupTwo, filterDescription, mainQuery);
ISearchResponse<ClassForScore> srcSkor = elasticClient.Search<ClassForScore>(s => s
.RequestConfiguration(r => r.DisableDirectStreaming())
.Query(fo)
.Size(100)
);
IReadOnlyCollection<IHit<ClassForScore>> lstSkor = srcSkor.Hits;
double? dblSkorAvg = 0;
// Some calculation..
//.....
Func<QueryContainerDescriptor<items>, QueryContainer> fo2 = funcScoreParam(new items(), filterItemNo, filterGroupOne, filterGroupTwo, filterDescription, mainQuery);
ISearchResponse<items> srcResult = elasticClient.Search<items>(s => s
.RequestConfiguration(r => r.DisableDirectStreaming())
.From(0)
.Size(100)
.Sort(S => S.Descending(SortSpecialField.Score).Ascending(r => r.item_no))
.MinScore(dblSkorAvg)
.Query(fo2)
);
private Func<QueryContainerDescriptor<T>, QueryContainer> funcScoreParam<T>(T nesne, QueryContainer filterItemNo, QueryContainer filterGroupOne, QueryContainer filterGroupTwo, QueryContainer filterDescription, QueryContainer mainQuery) where T : class
{
    return new Func<QueryContainerDescriptor<T>, QueryContainer>(q => q
        .FunctionScore(fsc => fsc
            .BoostMode(FunctionBoostMode.Sum)
            .ScoreMode(FunctionScoreMode.Sum)
            .Functions(fu => fu
                .Weight(w => w
                    .Weight(1000)
                    .Filter(wf => wf
                        .Bool(bb => bb
                            .Must(filterItemNo))
                    ))
                .Weight(w => w
                    .Weight(100)
                    .Filter(wf => wf
                        .Bool(bb => bb
                            .Must(filterGroupOne))
                    ))
                .Weight(w => w
                    .Weight(100)
                    .Filter(wf => wf
                        .Bool(bb => bb
                            .Must(filterGroupTwo))
                    ))
                .Weight(w => w
                    .Weight(50)
                    .Filter(wf => wf
                        .Bool(bb => bb
                            .Must(filterDescription))
                    ))
            )
            .Query(q2 => q2
                .Bool(b => b
                    .Should(mainQuery))
            )
        ));
}
You can use the explain parameter on the search API to return detailed information about the score computation for each hit:
ISearchResponse<items> srcResult = elasticClient.Search<items>(s => s
.RequestConfiguration(r => r.DisableDirectStreaming())
.From(0)
.Size(100)
.Sort(S => S.Descending(SortSpecialField.Score).Ascending(r => r.item_no))
.MinScore(dblSkorAvg)
.Query(fo2)
.Explain() // <-- explain score computation for each hit
);
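With explain enabled, each hit carries a tree of explanation nodes, and walking that tree shows which weight functions contributed to the total. The following is a minimal sketch of such a walk, assuming the NEST 7.x client, where a hit exposes an Explanation with Value, Description and Details; the ScoreExplainer class and its method names are hypothetical helpers, not part of NEST.

```csharp
using System;
using System.Collections.Generic;
using Nest;

// Hypothetical helper: prints the explanation tree of every hit in a response.
public static class ScoreExplainer
{
    public static void Print<T>(ISearchResponse<T> response) where T : class
    {
        foreach (IHit<T> hit in response.Hits)
        {
            Console.WriteLine($"{hit.Id}: score = {hit.Score}");

            // Explanation is only populated when the search was sent with .Explain().
            if (hit.Explanation == null) continue;

            Console.WriteLine($"  {hit.Explanation.Value}  {hit.Explanation.Description}");
            PrintDetails(hit.Explanation.Details, 2);
        }
    }

    // Recursively prints child nodes, indenting two spaces per level.
    private static void PrintDetails(IEnumerable<ExplanationDetail> details, int depth)
    {
        if (details == null) return;
        foreach (ExplanationDetail detail in details)
        {
            Console.WriteLine($"{new string(' ', depth * 2)}{detail.Value}  {detail.Description}");
            PrintDetails(detail.Details, depth + 1);
        }
    }
}
```

Calling ScoreExplainer.Print(srcResult) after the search above should list, for each hit, the nodes that add up to its score, so a total such as 1100 can be traced back to the individual weight functions that matched.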
There is also a dedicated explain API to understand how a specific document's score is calculated.
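A call to the explain API from NEST might look like the sketch below. It assumes NEST 7.x, that the index for the items type is resolved from the client's connection settings, and that the document id is known; the SingleDocumentExplain class and the documentId parameter are hypothetical names used only for illustration.

```csharp
using System;
using Nest;

// Hypothetical helper: explains how one document scores against a query.
public static class SingleDocumentExplain
{
    public static void Run(
        IElasticClient client,
        string documentId, // id of the document to explain (assumed to be known)
        Func<QueryContainerDescriptor<items>, QueryContainer> query)
    {
        // The index is assumed to be inferred from the default mapping for `items`.
        var response = client.Explain<items>(documentId, e => e.Query(query));

        Console.WriteLine($"matched: {response.Matched}");
        if (response.Explanation != null)
        {
            Console.WriteLine($"{response.Explanation.Value}  {response.Explanation.Description}");
        }
    }
}
```

Passing the same fo2 function used for the search, for example SingleDocumentExplain.Run(elasticClient, "some-id", fo2), keeps the explained score comparable with the score returned by the search itself.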