elasticsearch, elasticsearch-2.0

Adding uax_url_email analyzer to Elasticsearch 2.4.5


I'm trying to add an analyzer that uses the uax_url_email tokenizer.

▶ elasticsearch --version
Version: 2.4.5, Build: c849dd1/2017-04-24T16:18:17Z, JVM: 1.8.0_131

curl -XPUT http://localhost:9200/timeline -H 'Content-Type: application/json' -d'
{
    "settings": {
        "analysis": {
            "analyzer": {
                "email_analyzer": {
                    "type": "custom",
                    "tokenizer": "uax_url_email"
                }
            }
        }
    }
}'

However, this fails because the index already exists:

{
    "error": {
        "index": "timeline",
        "reason": "already exists",
        "root_cause": [
            {
                "index": "timeline",
                "reason": "already exists",
                "type": "index_already_exists_exception"
            }
        ],
        "type": "index_already_exists_exception"
    },
    "status": 400
}

So I tried updating the index with a PATCH request instead:

curl -XPATCH http://localhost:9200/timeline -H 'Content-Type: application/json' -d'
{
    "settings": {
        "analysis": {
            "analyzer": {
                "email_analyzer": {
                    "type": "custom",
                    "tokenizer": "uax_url_email"
                }
            }
        }
    }
}'

This returns no errors; the output is the same as if I'd issued a GET request to the /timeline index.

The interesting part of the output is that the settings haven't been updated:

    "settings": {
        "index": {
            "creation_date": "1497609042039",
            "number_of_replicas": "1",
            "number_of_shards": "5",
            "uuid": "XaRS0KN1SLWcBsl6eLMZcg",
            "version": {
                "created": "2040599"
            }
        }
    },

Perhaps wrongly, I would expect the newly PATCHed analysis object to be present...

Not sure where I'm going wrong here.


Solution

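  • New analyzers can only be added to a closed index, and the update goes to the _settings endpoint rather than to the index URL itself. Close the index, update the settings, then open it again: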

    curl -XPOST 'localhost:9200/timeline/_close'
    
    curl -XPUT 'localhost:9200/timeline/_settings' -d '{
      "analysis" : {
        "analyzer":{
          "email_analyzer":{
            "type":"custom",
            "tokenizer":"uax_url_email"
          }
        }
      }
    }'
    
    curl -XPOST 'localhost:9200/timeline/_open'
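
To check that the analyzer was registered once the index is open again, something like the following sketch may help. It assumes the same host and index name as in the question; `verify_analyzer` is a hypothetical helper name, and the sample text is made up for illustration.

```shell
#!/bin/sh
# Assumptions: Elasticsearch 2.x at localhost:9200 and the index name
# "timeline" from the question; override via the environment if needed.
ES_HOST="${ES_HOST:-localhost:9200}"
INDEX="${INDEX:-timeline}"

# Hypothetical helper (not part of Elasticsearch): prints the index settings,
# then runs a sample string through email_analyzer.
verify_analyzer() {
  # The "analysis" block should now be listed under settings.index.
  curl -s -XGET "$ES_HOST/$INDEX/_settings?pretty"

  # The sample text is made up; with the uax_url_email tokenizer the
  # address should come back as a single token rather than being split.
  curl -s -XGET "$ES_HOST/$INDEX/_analyze?pretty" -d '{
    "analyzer": "email_analyzer",
    "text": "Contact john.doe@example.com for details"
  }'
}
```

Source the snippet and call `verify_analyzer` after the `_open` request has returned. If the `_analyze` call reports that the analyzer cannot be found, the settings update most likely did not go through.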