symfony elasticsearch foselasticabundle

Elasticsearch - Count distinct


I have a basic index containing logs.

Some of the logs record a visit from user1 to user2.

I managed to count the total number of visits a user has received, but I don't know how to count the number of distinct users who visited that user.

This query gives me all the visit logs for a user:

{
    "post_filter":{
        "bool":{
            "must":[
                {
                    "term":{
                        "message":"visit"
                    }
                },
                {
                    "term":{
                        "ctxt_user2":"733264"
                    }
                }
            ]
        }
    },
    "query":{
        "match_all":{}
    }
}

Actually, I'm using FOSElasticaBundle for Symfony2:

$filter->addMust((new Term())->setTerm('message', 'visit'));
$filter->addMust((new Term())->setTerm('ctxt_user2', $this->search->getVisit()));

I read some pages of the ES documentation on aggregations, but I never managed to get what I want.

Translated to SQL, I just need:

SELECT COUNT(DISTINCT ctxt_user1)
FROM logs
WHERE ctxt_user2 = 733264

EDIT:

Cardinality seems to be what I need. Now I just need to find out how to use it with FOSElasticaBundle.

"aggs": {
    "yourdistinctcount": {
        "cardinality": {
            "field": "ctxt_user1"
        }
    }
}

Solution

  • Try this query (not tested):

    {
        "query":{
            "bool":{
                "must":[
                    {
                        "term":{
                            "message":"visit"
                        }
                    },
                    {
                        "term":{
                            "ctxt_user2":"733264"
                        }
                    }
                ]
            }
        },
        "aggs":{
            "yourdistinctcount":{
                "terms":{
                    "field":"ctxt_user1"
                }
            }
        }
    }
    

    post_filter cannot be used in your case. As the Elastic.co documentation puts it: "The post_filter is applied to the search hits at the very end of a search request, after aggregations have already been calculated." The filters therefore have to go in the query so that the aggregation only sees the matching logs.
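
    Since the edit in the question points to the cardinality aggregation, the two ideas can probably be combined as in the sketch below. It is untested and reuses the field names and the aggregation name from the question. The "size": 0 entry is an optional addition that is not in the original; it only suppresses the hits so that the response contains just the aggregation.

    ```json
    {
        "size": 0,
        "query": {
            "bool": {
                "must": [
                    { "term": { "message": "visit" } },
                    { "term": { "ctxt_user2": "733264" } }
                ]
            }
        },
        "aggs": {
            "yourdistinctcount": {
                "cardinality": {
                    "field": "ctxt_user1"
                }
            }
        }
    }
    ```

    The distinct count should then be in the response under aggregations.yourdistinctcount.value. Keep in mind that cardinality returns an approximate count, so it can deviate slightly from the exact number when a field has many distinct values. A terms aggregation, by contrast, returns one bucket per ctxt_user1 value and only the top buckets by default, so counting its buckets is reliable only for small result sets.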

    HtH,