I have an index in Solr which has, say, the following fields:
name
address
mobile
email
All fields are of type text_general. The problem is that in some items, some fields are missing, and I want to count these. For example, here is an item in the index with only the following two fields:
name
address
For example, I would like to count the total number of items with the field name populated (i.e. name exists in that item).
I think it sounds trivial but reading the docs I am unsure how to formulate such a query.
I have tried this, thanks to @MatsLindh. Here I am searching for all items with the field full_name, which all of them have:
>>> requests.get('http://localhost:8984/solr/solr_search_service/select', params={
... 'facet': 'true',
... 'facet.query': 'full_name:[* TO *]',
... 'wt': 'json',
... 'rows': 2,
... 'start': 0,
... }).json()
{'responseHeader': {'status': 0, 'QTime': 10, 'params': {'facet.query': 'full_name:[* TO *]', 'start': '0', 'rows': '2', 'facet': 'true', 'wt': 'json'}}, 'response': {'numFound': 0, 'start': 0, 'numFoundExact': True, 'docs': []}, 'facet_counts': {'facet_queries': {'full_name:[* TO *]': 0}, 'facet_fields': {}, 'facet_ranges': {}, 'facet_intervals': {}, 'facet_heatmaps': {}}}
But this somehow gives me zero as the count :(
You can use faceting to get counts for each of the fields you are interested in, without having to make a separate query for each case.
With facet.query you can run additional count queries across the documents returned by your main query (q). You can give multiple facet.query parameters, one per field, which gives you a count of how many documents have each field populated. To count the documents that are missing a field instead, negate the range, e.g. facet.query=-name:[* TO *]. If you want to filter the document set first, give a query in q, for example "how many of the documents with abc in the field tag are missing name".
?q=*:*&facet=true&facet.query=name:[* TO *]&facet.query=address:[* TO *]
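The same request can be sketched in Python. Note that the attempt in the question sends no q parameter, which is consistent with numFound being 0 and every facet count being 0, since facet.query only counts within the documents matched by q. This is a minimal sketch, not a definitive implementation: the helper names (build_count_params, extract_counts, count_populated) are made up for illustration, and it uses only the standard library so it runs without extra packages. The same params dict should also work with requests.get(url, params=...), which repeats a key for each item in a list value.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_count_params(fields, main_query="*:*"):
    """Build Solr select params that count documents having each field populated.

    Hypothetical helper: `fields` is a list of field names, `main_query`
    restricts the document set the counts are computed over.
    """
    return {
        "q": main_query,   # facet.query counts are computed within the documents matched by q
        "rows": 0,         # only the counts are needed, not the documents themselves
        "wt": "json",
        "facet": "true",
        # one facet.query per field; a list value is sent as a repeated parameter
        "facet.query": [f"{field}:[* TO *]" for field in fields],
    }


def extract_counts(response_json, fields):
    """Map each field name to its count from the facet_queries section."""
    facet_queries = response_json["facet_counts"]["facet_queries"]
    # Solr keys each count by the facet.query string that was sent
    return {field: facet_queries[f"{field}:[* TO *]"] for field in fields}


def count_populated(select_url, fields):
    """Send the request to a Solr select handler and return {field: count}."""
    query_string = urlencode(build_count_params(fields), doseq=True)
    with urlopen(f"{select_url}?{query_string}", timeout=10) as resp:
        return extract_counts(json.load(resp), fields)
```

Assuming the core from the question, it could be called as count_populated('http://localhost:8984/solr/solr_search_service/select', ['name', 'address', 'mobile', 'email']). Subtracting a field's count from numFound would then give the number of documents missing that field.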