Solr: count items with certain fields


I have an index in Solr which has, say, the following fields:

name
address
mobile
email

All fields are of type text_general. The problem is that in some items some fields are missing, and I want to count these. For example, here is an item in the index with only the following two fields:

name
address

So I would like to count the total number of items with the field name populated (i.e. name exists in that item).

It sounds trivial, but after reading the docs I am still unsure how to formulate such a query.

I have tried this, thanks to @MatsLindh. Here I am searching for all items with the field full_name, which every item has:

>>> import requests
>>> requests.get('http://localhost:8984/solr/solr_search_service/select', params={
... 'facet': 'true',
... 'facet.query': 'full_name:[* TO *]',
... 'wt': 'json',
... 'rows': 2,
... 'start': 0,
... }).json()
{'responseHeader': {'status': 0,
                    'QTime': 10,
                    'params': {'facet.query': 'full_name:[* TO *]',
                               'start': '0',
                               'rows': '2',
                               'facet': 'true',
                               'wt': 'json'}},
 'response': {'numFound': 0, 'start': 0, 'numFoundExact': True, 'docs': []},
 'facet_counts': {'facet_queries': {'full_name:[* TO *]': 0},
                  'facet_fields': {},
                  'facet_ranges': {},
                  'facet_intervals': {},
                  'facet_heatmaps': {}}}

but somehow this gives me zero as the count :(


Solution

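  • You can use faceting to get a count for each field you are interested in, without having to make a separate query for each one.

    With facet.query you can run an additional query across the documents returned from your main query (q). Facet counts are computed over that result set, which is most likely why the request in the question, sent without a q parameter, reports numFound 0 and a facet count of 0; adding q=*:* matches every document. You can give multiple facet.query parameters, each producing a count of how many documents have the given field populated. To count the documents that are missing a field instead, negate the query, e.g. facet.query=-name:[* TO *]. If you want to filter the document set first, put a query in q, for example "how many of the documents with abc in the field tag have a name".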

    ?q=*:*&facet=true&facet.query=name:[* TO *]&facet.query=address:[* TO *]
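
The same request can be sketched in Python. This is a minimal sketch rather than a definitive implementation: it uses only the standard library, the URL is the one from the question, and the helper names (build_params, counts_from_response, count_populated) are made up for illustration. The params dict could equally be passed to requests.get, which also sends a list value as repeated parameters.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# URL taken from the question; adjust host, port, and core name for your setup.
SOLR_SELECT_URL = "http://localhost:8984/solr/solr_search_service/select"


def build_params(fields, q="*:*"):
    """Build select parameters with one facet.query per field (hypothetical helper)."""
    return {
        "q": q,          # facets are counted over the documents matched by q
        "rows": 0,       # only the counts are needed, not the documents
        "wt": "json",
        "facet": "true",
        # A list value becomes one repeated facet.query parameter per field.
        "facet.query": [f"{field}:[* TO *]" for field in fields],
    }


def counts_from_response(body, fields):
    """Map each field name to its facet.query count in a parsed response."""
    facet_queries = body["facet_counts"]["facet_queries"]
    return {field: facet_queries[f"{field}:[* TO *]"] for field in fields}


def count_populated(fields, q="*:*", url=SOLR_SELECT_URL):
    """Query Solr and return {field: number of matched documents with that field}."""
    query_string = urlencode(build_params(fields, q), doseq=True)
    with urlopen(f"{url}?{query_string}") as response:
        body = json.load(response)
    return counts_from_response(body, fields)
```

Calling count_populated(['name', 'address', 'mobile', 'email']) against a running core would return one count per field in a single round trip.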