Elasticsearch: what does my index contain, docs or positions?


I've created an ES index using the following command:

curl -X PUT -H 'Content-Type: application/json' -H 'Accept: application/json' -d '{"settings" :{"number_of_shards" : 10, "number_of_replicas" : 0, "analysis":{"analyzer": {"my_analyzer": {"type": "custom", "tokenizer":"whitespace","filter":["lowercase","porter_stem"],"stopwords":[...stopwords here ...]}}}}, "mappings" : {"html" : {"properties" : {"head" : { "type" : "text", "analyzer": "my_analyzer" }, "body" : { "type" : "text", "analyzer": "my_analyzer"}}}}}' localhost:9200/docs

I read here that:

Analyzed string fields use positions as the default, and all other fields use docs as the default.

Since my fields are of text type, are they considered string fields?

My main question is how to find out what my index contains (docs or positions) for each field. I used the /docs/_settings endpoint to get the index settings, but it didn't give a useful answer.

Any hints?

EDIT:

In addition to the answer by @ibexit below, I verified this in practice by issuing phrase queries against the ES indices.
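Why a phrase query works as a check: matching a phrase requires knowing where each term occurs, which only a positional index records. Below is a minimal Python sketch of that idea. It is a toy model, not Elasticsearch code; `build_index` and `phrase_match` are hypothetical names, and the tokenization (lowercase plus whitespace split) is a simplification of the analyzer above.

```python
from collections import defaultdict

def build_index(docs):
    """Map each term to {doc_id: [positions]} (lowercased, whitespace-tokenized)."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, token in enumerate(text.lower().split()):
            index[token].setdefault(doc_id, []).append(pos)
    return index

def phrase_match(index, phrase):
    """Return sorted ids of documents where the phrase terms occur at consecutive positions."""
    terms = phrase.lower().split()
    if not terms or any(t not in index for t in terms):
        return []
    hits = []
    for doc_id, starts in index[terms[0]].items():
        for start in starts:
            # Term i of the phrase must sit exactly i positions after the first term.
            if all(start + i in index[t].get(doc_id, []) for i, t in enumerate(terms)):
                hits.append(doc_id)
                break
    return sorted(hits)

docs = {1: "quick brown fox", 2: "brown quick fox"}
index = build_index(docs)

print(sorted(index["quick"]))              # doc ids only: all a "docs" index would know
print(phrase_match(index, "quick brown"))  # needs positions to tell doc 1 from doc 2
```

Both documents contain both terms, so document ids alone cannot distinguish them; only the stored positions show that "quick brown" occurs in document 1 and not in document 2. As far as I know, Elasticsearch rejects a phrase query on a field indexed without positions, so a phrase query that runs and returns sensible hits suggests positions are present.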


Solution

  • You defined the fields as text without specifying index_options in your mapping (text fields are the analyzed string fields the documentation refers to). In this case the default for text fields applies (index_options=positions), so the inverted index contains document numbers, term frequencies, and term positions (order) for these fields.

    For more in-depth information about inverted indices, have a look at https://www.elastic.co/blog/found-elasticsearch-from-the-bottom-up or https://youtu.be/x37B_lCi_gc. These should be a good starting point for your research.

    Cheers!
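To make the defaulting rule from the answer concrete, here is a small Python sketch that works out the effective index_options per field from a mapping's properties. The helper name `effective_index_options` is hypothetical, the defaults are taken from the documentation sentence quoted in the question, and the `properties` dict is copied from the mapping in the question rather than fetched from a cluster.

```python
# Defaults as stated in the quoted documentation (an assumption, not read from a cluster).
DEFAULT_FOR_TEXT = "positions"
DEFAULT_FOR_OTHER = "docs"

def effective_index_options(properties):
    """Return {field_name: index_options}, using the default when none is set explicitly."""
    result = {}
    for name, spec in properties.items():
        explicit = spec.get("index_options")
        if explicit is not None:
            result[name] = explicit          # explicit setting wins
        elif spec.get("type") == "text":
            result[name] = DEFAULT_FOR_TEXT  # analyzed text field
        else:
            result[name] = DEFAULT_FOR_OTHER # any other field type
    return result

# The "properties" section of the mapping from the question.
properties = {
    "head": {"type": "text", "analyzer": "my_analyzer"},
    "body": {"type": "text", "analyzer": "my_analyzer"},
}

print(effective_index_options(properties))
```

The mapping returned by the index normally lists only settings that were set explicitly, which is why index_options does not show up for these fields. It may also be worth trying the get-field-mapping API with include_defaults=true to have Elasticsearch report the defaults itself.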