When more than 1000 records match the filter query, my Solr collection returns a different percentile value every time. I'm using the same filter query each time, with a JSON facet query that computes the percentile inside a query facet.
Sample query: `
json.facet = {
  time: "sum(time)",
  users: "sum(numofusers)",
  queryfacet: {
    q: "time: [0 TO 50000}",
    type: query,
    facet: {
      timepercentile: "percentile(time, 95)"
    }
  }
}
`
The `percentile` function is an approximation, not an exact value. This is documented under the stats functions:
percentiles
A list of percentile values based on cut-off points specified by the parameter value, such as 1,99,99.9. These values are an approximation, using the t-digest algorithm. This statistic is computed for numeric field types and is not computed by default.
The `percentile` function in the JSON Facet API uses the same method:
Percentile estimates via t-digest algorithm. When sorting by this metric, the first percentile listed is used as the sort value.
You can read more about the t-digest algorithm on the GitHub repository.
Since these values are estimates, I'm guessing there's some minor variance in which elements get sampled; it might also depend on the structure of your index (number of nodes, when they get updated, when commits get issued, etc.).
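To make the "estimate" point concrete, here is a minimal Python sketch. It is not Solr's code and not the real t-digest: `ToyDigest` is a hypothetical, much cruder summary that keeps a bounded number of (mean, count) centroids and merges the two closest ones when it runs out of room. It only illustrates the general idea that a bounded-size summary can give slightly different answers for the same values depending on the order in which they arrive, whereas an exact percentile cannot. The data is made up (uniform values in the same 0 to 50000 range as the query).

```python
import math
import random


def exact_percentile(values, p):
    """Exact nearest-rank percentile: sort everything, pick the rank."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


class ToyDigest:
    """Toy bounded-size summary (an assumption for illustration,
    NOT the real t-digest): at most max_centroids [mean, count] pairs."""

    def __init__(self, max_centroids=20):
        self.max_centroids = max_centroids
        self.centroids = []  # kept sorted by mean

    def add(self, x):
        self.centroids.append([float(x), 1])
        self.centroids.sort(key=lambda c: c[0])
        if len(self.centroids) > self.max_centroids:
            self._merge_closest()

    def _merge_closest(self):
        # Find the adjacent pair whose means are closest and merge them
        # into one centroid with a count-weighted mean.
        i = min(
            range(len(self.centroids) - 1),
            key=lambda k: self.centroids[k + 1][0] - self.centroids[k][0],
        )
        (m1, c1), (m2, c2) = self.centroids[i], self.centroids[i + 1]
        merged = [(m1 * c1 + m2 * c2) / (c1 + c2), c1 + c2]
        self.centroids[i:i + 2] = [merged]

    def percentile(self, p):
        # Walk the centroids until the cumulative count reaches the target.
        total = sum(count for _, count in self.centroids)
        target = p * total / 100
        seen = 0
        for mean, count in self.centroids:
            seen += count
            if seen >= target:
                return mean
        return self.centroids[-1][0]


def estimate(data, p=95):
    digest = ToyDigest()
    for value in data:
        digest.add(value)
    return digest.percentile(p)


rng = random.Random(42)
values = [rng.uniform(0, 50000) for _ in range(2000)]
shuffled = values[:]
rng.shuffle(shuffled)  # same values, different arrival order

exact = exact_percentile(values, 95)
est_original_order = estimate(values)
est_shuffled_order = estimate(shuffled)
print("exact:", exact)
print("estimate, original order:", est_original_order)
print("estimate, shuffled order:", est_shuffled_order)
```

The exact value is the same however the input is ordered; the two estimates may or may not match each other. If you need a stable number, computing the percentile yourself from the raw values (as `exact_percentile` does) is one option, at the cost of fetching them.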