When doing a search operation in Elasticsearch, I want the metadata to be filtered out so that only "_source" is returned in the response. I'm able to achieve this through "search" in the following way:
out1 = es.search(index='index.com', filter_path=['hits.hits._id', 'hits.hits._source'])
But when I do the same with the scan method, it just returns an empty list:
out2 = helpers.scan(es, query, index='index.com', doc_type='2016-07-27', filter_path=['hits.hits._source'])
The problem may be with the way I'm processing the response of the 'scan' method or with the way I'm passing the value to filter_path. To check the output, I convert out2 to a list.
The scan helper currently doesn't allow passing extra parameters to the scroll API, so your filter_path doesn't apply to it. It does, however, get applied to the initial search API call, which is used to initiate the scan/scroll cycle. This means that the scroll_id is stripped from the response, causing the entire operation to fail.
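To see why the scroll_id goes missing, here is a small pure-Python sketch. It is only a simplified stand-in for the server-side filtering (one dotted path, no wildcards), not how Elasticsearch implements it, and the response dict, the function name filter_response and the "title" field are made up for illustration:

```python
def filter_response(response, path):
    """Keep only the given dotted path (simplified: single path, no wildcards)."""

    def pick(node, parts):
        if not parts:
            return node
        if isinstance(node, list):
            # Apply the remaining path to every element of the list.
            kept = [pick(item, parts) for item in node]
            return [k for k in kept if k is not None]
        if isinstance(node, dict) and parts[0] in node:
            sub = pick(node[parts[0]], parts[1:])
            return None if sub is None else {parts[0]: sub}
        return None

    return pick(response, path.split(".")) or {}


# Hypothetical shape of the first response in a scan/scroll cycle.
response = {
    "_scroll_id": "abc123",
    "took": 3,
    "hits": {
        "total": 2,
        "hits": [
            {"_index": "index.com", "_id": "1", "_source": {"title": "a"}},
            {"_index": "index.com", "_id": "2", "_source": {"title": "b"}},
        ],
    },
}

filtered = filter_response(response, "hits.hits._source")
print("_scroll_id" in filtered)  # False
```

Everything outside hits.hits._source is dropped, including the top-level _scroll_id that the helper needs in order to request the next page.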
In your case, even passing the filter_path parameter to the scroll API calls would cause the helper to fail, because it would strip the scroll_id, which is needed for this operation to work, and also because the helper relies on the structure of the response.
My recommendation would be to use source filtering if you need to limit the size of the response, or to use a smaller size parameter than the default of 1000.
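A rough sketch of that approach: put a "_source" list in the query body, leave filter_path out, and pull "_source" out of each hit on the client side. The names build_query and scan_sources, the "title" field and the size of 500 are my own assumptions for illustration. The scan argument exists only so the sketch can be tried without a running cluster; by default it falls back to helpers.scan.

```python
def build_query(fields, query=None):
    """Return a search body that restricts _source to the given fields."""
    return {
        "_source": list(fields),
        "query": query or {"match_all": {}},
    }


def scan_sources(client, index, fields, scan=None, size=500):
    """Yield only the _source of every hit returned by the scan helper."""
    if scan is None:
        # Imported lazily so the sketch can be read without the client installed.
        from elasticsearch import helpers
        scan = helpers.scan
    # No filter_path here: the helper needs the full response structure.
    for hit in scan(client, query=build_query(fields), index=index, size=size):
        yield hit["_source"]
```

With a real client this would be used roughly as out2 = list(scan_sources(es, 'index.com', ['title'])), which gives a list of plain "_source" dicts without the surrounding metadata.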
Hope this helps, Honza