Tags: rest, elasticsearch, twitter4j

Elasticsearch indexing of Twitter bounding box not recognized as a geo_shape


I'm trying to create an Elasticsearch mapping for the bounding_box array in Twitter's Place object, and I can't get Elasticsearch to index it as a geo_shape. In my app, I will be getting the raw JSON from Twitter4j. However, Twitter's bounding box is not a closed polygon (the last point does not repeat the first), so for the purpose of this test I edited the JSON and closed it by hand. I'm using Elastic Cloud (ES v5) and the REST API, and visualizing with Kibana.
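For the real app, the manual edit could be replaced by a small preprocessing step. The following is only a minimal Python sketch of that idea, assuming the raw JSON has already been parsed into a dict; `close_rings` is a hypothetical helper name, not part of Twitter4j or Elasticsearch.

```python
import copy


def close_rings(bounding_box):
    """Return a copy of a polygon dict in which every ring is closed.

    A ring counts as closed when its last position equals its first.
    The input dict is left unmodified.
    """
    closed = copy.deepcopy(bounding_box)
    for ring in closed["coordinates"]:
        # Append a copy of the first position only if the ring is open.
        if ring and ring[0] != ring[-1]:
            ring.append(list(ring[0]))
    return closed


# The four corners as Twitter delivers them (no repeated first point).
twitter_box = {
    "type": "polygon",
    "coordinates": [[
        [0.773779, 51.96971],
        [0.773779, 51.976437],
        [0.781794, 51.976437],
        [0.781794, 51.96971],
    ]],
}

closed_box = close_rings(twitter_box)
```

Calling the helper on a ring that is already closed leaves it unchanged, so it should be safe to apply to every document before indexing.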

Here is the mapping I'm trying to use. I've tried several variations, with and without a "properties" block, and none of them work. With this mapping, I can PUT the mapping successfully, but when I POST the document, Kibana shows the bounding_box field as an unknown field type.

The Point coordinates field is indexed as a geo_point just fine; it's the bounding box that is not.

Here is my mapping:

PUT /testgeo

{
    "mappings": {
        "tweet": {
            "_all": {
                "enabled": false
            },
            "properties": {
                "created_at": {
                    "type": "date",
                    "format": "EEE MMM dd HH:mm:ss Z YYYY||strict_date_optional_time||epoch_millis"
                },
                "coordinates": {
                    "properties": {
                        "coordinates": {
                            "type": "geo_point",
                            "ignore_malformed": true
                        }
                    }
                },
                "place": {
                    "properties": {
                        "bounding_box": {
                            "type": "geo_shape",
                            "tree": "quadtree",
                            "precision": "1m"
                        }
                    }
                }
            }
        }
    }
}

Here is the snippet of the document I am trying to POST (NOTE: I manually added the 5th array element to close the bounding box).

POST /testgeo/tweet/1

{
    ...
    "coordinates": {
        "type": "point",
        "coordinates": [
            0.78055556,
            51.97222222
        ]
    },
    "place": {
        "id": "0c31a1a5b970086e",
        "url": "https:\/\/api.twitter.com\/1.1\/geo\/id\/0c31a1a5b970086e.json",
        "place_type": "city",
        "name": "Bures",
        "full_name": "Bures, England",
        "country_code": "GB",
        "country": "United Kingdom",
        "bounding_box": {
            "type": "polygon",
            "coordinates": [
                [
                    [
                        0.773779,
                        51.96971
                    ],
                    [
                        0.773779,
                        51.976437
                    ],
                    [
                        0.781794,
                        51.976437
                    ],
                    [
                        0.781794,
                        51.96971
                    ],
                    [
                        0.773779,
                        51.96971
                    ]
                ]
            ]
        },
        "attributes": {
        }
    },
    ...
}

If anyone can identify the reason for this and correct it, I would be most appreciative.

NOTE 1: I tried using the mapping and document examples from Elastic's geo_shape documentation page, and Kibana again showed the location field as an unknown type.

PUT /testgeo

{
    "mappings": {
        "tweet": {
            "_all": {
                "enabled": false
            },
            "properties": {
                "location": {
                    "type": "geo_shape",
                    "tree": "quadtree",
                    "precision": "1m"
                }
            }
        }
    }
}

POST /testgeo/tweet/1

{
    "location" : {
        "type" : "polygon",
        "coordinates" : [
            [ [100.0, 0.0], [101.0, 0.0], [101.0, 1.0], [100.0, 1.0], [100.0, 0.0] ]
        ]
    }
}
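Kibana's field list is only one view of the mapping. The mapping that Elasticsearch itself stores can be read back with GET /testgeo/_mapping. As a rough Python sketch, the field type can be pulled out of that response like this. The `sample_response` dict is hand-written here to mirror the mapping above and is an assumption about the response layout, not captured output; `field_type` is a hypothetical helper name.

```python
def field_type(mapping_response, index, doc_type, path):
    """Look up the mapped type of a dotted field path in a mapping response."""
    props = mapping_response[index]["mappings"][doc_type]["properties"]
    node = None
    for part in path.split("."):
        node = props[part]
        # Object fields nest their children under another "properties" key.
        props = node.get("properties", {})
    return node.get("type")


# Hand-written stand-in for the body returned by GET /testgeo/_mapping.
sample_response = {
    "testgeo": {
        "mappings": {
            "tweet": {
                "properties": {
                    "place": {
                        "properties": {
                            "bounding_box": {
                                "type": "geo_shape",
                                "tree": "quadtree",
                                "precision": "1.0m",
                            }
                        }
                    }
                }
            }
        }
    }
}

bbox_type = field_type(sample_response, "testgeo", "tweet", "place.bounding_box")
```

If the looked-up type is geo_shape, the mapping was applied and the "unknown" label is coming from the visualization layer rather than from the index.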

Solution

  • It turns out that Kibana simply does not reflect the type for geo_shape fields. When doing a geo query, however, Elasticsearch returns correct results.

    For example:

    {
      "query": {
        "bool": {
          "must": {
            "match_all": {}
          },
          "filter": {
            "geo_shape": {
              "place.bounding_box": {
                "shape": {
                  "type": "polygon",
                  "coordinates": [
                    [
                        [
                            0.773779,
                            51.96971
                        ],
                        [
                            0.773779,
                            51.976437
                        ],
                        [
                            0.781794,
                            51.976437
                        ],
                        [
                            0.781794,
                            51.96971
                        ],
                        [
                            0.773779,
                            51.96971
                        ]
                    ]
                  ]
                },
                "relation": "within"
              }
            }
          }
        }
      }
    }
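The same query body can be built programmatically instead of written by hand. This is a minimal Python sketch under the assumption that the ring is already closed; `geo_shape_filter` is a hypothetical helper name, and sending the body to the index's `_search` endpoint is left out.

```python
import json


def geo_shape_filter(field, ring, relation="within"):
    """Build a bool query that filters on a geo_shape field with a polygon."""
    return {
        "query": {
            "bool": {
                "must": {"match_all": {}},
                "filter": {
                    "geo_shape": {
                        field: {
                            "shape": {
                                "type": "polygon",
                                # A polygon is a list of rings; only the
                                # outer ring is used here.
                                "coordinates": [ring],
                            },
                            "relation": relation,
                        }
                    }
                },
            }
        }
    }


ring = [
    [0.773779, 51.96971],
    [0.773779, 51.976437],
    [0.781794, 51.976437],
    [0.781794, 51.96971],
    [0.773779, 51.96971],
]

body = geo_shape_filter("place.bounding_box", ring)
payload = json.dumps(body)  # JSON text suitable for a search request body
```

Keeping the field name and relation as parameters makes it easy to try other relations against the same shape.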