The following are two sample documents in my Elasticsearch index:
{
  "message": "M1",
  "date": "date object",
  "comments": [
    {
      "msg": "good",
      "date": "date_obj1"
    },
    {
      "msg": "bad",
      "date": "date_obj2"
    },
    {
      "msg": "ugly",
      "date": "date_obj3"
    }
  ]
}
and
{
  "message": "M2",
  "date": "date_object5",
  "comments": [
    {
      "msg": "ugly",
      "date": "date_obj7"
    },
    {
      "msg": "pagli",
      "date": "date_obj8"
    }
  ]
}
Now I need to find the number of documents per day and the number of comments per day. I can get the number of documents per day with a date histogram, and it gives me the correct results. I use the following aggregation query:
"aggs" : {
  "posts_over_days" : {
    "date_histogram" : { "field" : "date", "interval": "day" }
  }
}
But when I try a similar thing to get comments per day, it returns incorrect data (for 1500+ comments it returns only 160-odd comments). I am making the following query:
"aggs" : {
  "comments_over_days" : {
    "date_histogram" : { "field" : "comments.date", "interval": "day" }
  }
}
How can I get the desired result? Is there a way in Elasticsearch to get what I want? Please let me know if I need to provide any other info.
Expected Output:
buckets: [
  {
    time_interval: date_objectA,
    doc_count: x
  },
  {
    time_interval: date_objectB,
    doc_count: y
  }
]
Use the Value Count aggregation; it counts the number of values of the field across your documents. E.g. based on your data (5 comments in 2 documents):
curl -XGET 'http://localhost:9200/myindex/mydata/_search?search_type=count&pretty' -d '{
  "aggs" : {
    "grades_count" : { "value_count" : { "field" : "comments.date" } }
  }
}'
{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 2,
    "max_score" : 0.0,
    "hits" : [ ]
  },
  "aggregations" : {
    "grades_count" : {
      "value" : 5
    }
  }
}
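To see why the two aggregations give different numbers, here is a small in-memory sketch in Python. It is not Elasticsearch code; it only imitates the counting rules as I understand them: a date_histogram bucket's doc_count counts each document at most once per bucket, while value_count counts every value of the field. That would explain the low number in the question if many posts have several comments on the same day. The dates below are made up for illustration, since the question only shows placeholders, and the function names are hypothetical.

```python
from collections import Counter

# Hypothetical stand-ins for the two sample documents; the real dates are
# not given in the question, so these values are assumptions.
docs = [
    {"message": "M1", "date": "2014-11-28",
     "comments": [{"msg": "good", "date": "2014-11-28"},
                  {"msg": "bad", "date": "2014-11-28"},
                  {"msg": "ugly", "date": "2014-11-29"}]},
    {"message": "M2", "date": "2014-11-27",
     "comments": [{"msg": "ugly", "date": "2014-11-28"},
                  {"msg": "pagli", "date": "2014-11-28"}]},
]


def doc_count_per_day(docs):
    """Imitate date_histogram on comments.date: one count per document per day."""
    counts = Counter()
    for doc in docs:
        # A set, so a document with several comments on one day counts once.
        for day in {c["date"] for c in doc["comments"]}:
            counts[day] += 1
    return dict(counts)


def value_count(docs):
    """Imitate value_count on comments.date: every value is counted."""
    return sum(len(doc["comments"]) for doc in docs)


print(doc_count_per_day(docs))  # bucket counts add up to 3, not 5
print(value_count(docs))        # 5
```

With these made-up dates the histogram buckets add up to 3 while there are 5 comments, which is the same kind of gap as in the question.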
The Value Count aggregation can be nested inside the date buckets, so each day's bucket also reports the number of comments on that day's posts:
curl -XGET 'http://localhost:9200/myindex/mydata/_search?search_type=count&pretty' -d '{
  "aggs" : {
    "posts_over_days" : {
      "date_histogram" : { "field" : "date", "interval": "day" },
      "aggs" : {
        "grades_count" : { "value_count" : { "field" : "comments.date" } }
      }
    }
  }
}'
With results:
"aggregations" : {
  "posts_over_days" : {
    "buckets" : [ {
      "key_as_string" : "2014-11-27T00:00:00.000Z",
      "key" : 1417046400000,
      "doc_count" : 1,
      "grades_count" : {
        "value" : 2
      }
    }, {
      "key_as_string" : "2014-11-28T00:00:00.000Z",
      "key" : 1417132800000,
      "doc_count" : 1,
      "grades_count" : {
        "value" : 3
      }
    }
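
Reading these buckets from client code could look like the following. This is a minimal Python sketch, assuming the response has already been parsed into a dict; the `response` literal just copies the two buckets shown above, and `comments_per_day` is a name made up here, not part of any Elasticsearch client.

```python
# Aggregation part of the response, copied from the two buckets shown above.
response = {
    "aggregations": {
        "posts_over_days": {
            "buckets": [
                {"key_as_string": "2014-11-27T00:00:00.000Z", "key": 1417046400000,
                 "doc_count": 1, "grades_count": {"value": 2}},
                {"key_as_string": "2014-11-28T00:00:00.000Z", "key": 1417132800000,
                 "doc_count": 1, "grades_count": {"value": 3}},
            ]
        }
    }
}


def comments_per_day(response):
    """Return (day, posts, comments) for each date_histogram bucket."""
    buckets = response["aggregations"]["posts_over_days"]["buckets"]
    return [
        # Keep only the YYYY-MM-DD part of the bucket key.
        (b["key_as_string"][:10], b["doc_count"], b["grades_count"]["value"])
        for b in buckets
    ]


for day, posts, comments in comments_per_day(response):
    print(day, posts, comments)
```

Here doc_count is the number of posts in the bucket and grades_count.value is the number of comments on those posts.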