Tags: python, elasticsearch, urllib2, aggregation

How to make urllib return an array in Python - Parse the results of a call to Elasticsearch?


For example, I would like to aggregate by state, but the response comes back as a string, not an array. How can I write an Elasticsearch terms aggregation that returns an array?

Part of my code:

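import urllib2  # imported under its own name, since the calls below use urllib2.*
import json

# Terms aggregation on states.raw; "size": 0 suppresses the search hits
query = {
    "size": 0,
    "aggs": {
        "states": {
            "terms": {
                "field": "states.raw",
                "size": 8
            }
        }
    }
}

query = json.dumps(query)
headers = {'Content-type': 'application/json'}
# url is defined elsewhere in my code
req = urllib2.Request(url, query, headers)
out = urllib2.urlopen(req)
rs = out.read()  # the raw response body, a str

print type(rs)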

The response:

{
   "took": 1,
   "timed_out": false,
   "_shards": {
      "total": 1,
      "successful": 1,
      "failed": 0
   },
   "hits": {
      "total": 2,
      "max_score": 0,
      "hits": []
   },
   "aggregations": {
      "states": {
         "doc_count_error_upper_bound": 0,
         "sum_other_doc_count": 0,
         "buckets": [
            {
               "key": "New York",
               "doc_count": 200
            },
            {
               "key": "California",
               "doc_count": 10
            },
            {
               "key": "New Jersey",
               "doc_count": 10
            },
            {
               "key": "North Carolina",
               "doc_count": 1802
            },
            {
               "key": "North Dakota",
               "doc_count": 125
            }
         ]
      }
   }
}

I try to read the result with rs['aggregations']['states']['buckets'][0]['key'], but I get the error message:

"TypeError: string indices must be integers, not str"

I found that the returned data is a string. How can I get it back as an array?


Solution

  • Run

    import json 
    ... 
    rs = json.loads(rs)
    

    Then rs will be a Python dict instead of a string, so you can read the first bucket's key with rs['aggregations']['states']['buckets'][0]['key']. The 'buckets' value itself is a list.
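
A minimal sketch of that parsing step, runnable without a server. The `raw` string is a hypothetical stand-in for what `out.read()` returns, trimmed to the fields used here, and `first_key` and `counts` are illustrative names, not part of the original code:

```python
import json

# Hypothetical stand-in for the string returned by out.read(),
# trimmed to the aggregation part of the response.
raw = '''{
  "aggregations": {
    "states": {
      "buckets": [
        {"key": "New York", "doc_count": 200},
        {"key": "California", "doc_count": 10}
      ]
    }
  }
}'''

rs = json.loads(raw)  # JSON text -> Python dict

# 'buckets' is a plain list, one dict per state
buckets = rs['aggregations']['states']['buckets']
first_key = buckets[0]['key']

# Flatten the buckets into (state, count) pairs
counts = [(b['key'], b['doc_count']) for b in buckets]

print(first_key)
print(counts)
```

Indexing the unparsed string with 'aggregations' is what raises the TypeError from the question; after json.loads the same lookup is an ordinary dict access.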

    However, it's recommended that you use the Python client for Elasticsearch (elasticsearch-py) instead of writing your own, as the client already handles what you are looking for, among other things. Check my answer here for an example on how to run a query using elasticsearch-py.

    Here's the link for elasticsearch-py's documentation: