Mapping Couchbase documents to Elasticsearch


I am trying to replicate data from a Couchbase bucket to an Elasticsearch index named "test". I have applied the settings below to the "test" index.

"settings": {
"analysis": {
  "analyzer": {
    "my_analyzer":{
    "type":"custom",
    "tokenizer" : "standard",
    "filter" : ["standard", "lowercase","asciifolding","my_stemmer","autocomplete","my_stop","my_synonym_filter"]
    }
  },
  "filter": {
    "my_stemmer":{
      "type":"stemmer",
      "name":"english"
    },
    "autocomplete":{
      "type":"edge_ngram",
      "min_gram":1,
      "max_gram":20
    },
    "my_stop":{
      "type":"stop",
       "stopwords":"_english_"
    },
    "my_synonym_filter":{
      "type":"synonym",
      "synonyms": [
        "united states,u s a,united states of america=>usa"
      ]
    }
  }
}
}

The mapping for the "profile" type is below.

 "profile":{
    "properties": {
    "name":{
      "type": "string",
      "index_analyzer": "my_analyzer",
      "search_analyzer": "english"
    },
    "title":{
      "type": "string",
      "index_analyzer": "my_analyzer",
      "search_analyzer": "english"
    },
    "description":{
      "type": "string",
      "search_analyzer": "english",
      "index_analyzer": "my_analyzer"
    }
    }
 }

My Couchbase document is below.

 {
  "name": "xxxx",
  "title": "junior android developer",
  "description": "I am developing new android applications"
 }

My questions are:

  1. When I replicate this document to Elasticsearch, how can I apply these settings and this mapping to it?

  2. By default, the Couchbase transport plugin maps this document to the "couchbaseDocument" type, and Elasticsearch generates the mapping for it automatically. How can I change this behaviour?

Please help me. Thank you very much in advance.


Solution

  • To have all "profile:xxx" documents mapped to the "profile" type in ES, you just need to add the type settings in the plugin configuration. In your case, add the following to the elasticsearch.yml config file on each of the ES nodes:

    couchbase.typeSelector: org.elasticsearch.transport.couchbase.capi.DelimiterTypeSelector
    

    The DelimiterTypeSelector splits the document id by a delimiter (':' by default) and uses the first token as the document type, which is exactly what you want. Once the documents are mapped to the correct type, ES will use the mappings you configured automatically.

    Take a look here for some of the other advanced settings you can use. In particular, you might want to use the couchbase.ignoreFailures setting.
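
To make the id-splitting rule described above concrete, here is a minimal Python sketch of the idea. It is not the plugin's code: the function name `select_type` is hypothetical, and falling back to "couchbaseDocument" for ids that have no delimiter is an assumption based on the default type mentioned in the question.

```python
# Illustrative sketch only; the real DelimiterTypeSelector is part of the plugin.
DEFAULT_TYPE = "couchbaseDocument"  # default type named in the question


def select_type(doc_id: str, delimiter: str = ":",
                default_type: str = DEFAULT_TYPE) -> str:
    """Return the part of doc_id before the first delimiter.

    Ids with no delimiter, or with an empty prefix, get default_type.
    That fallback is an assumption of this sketch.
    """
    prefix, sep, _rest = doc_id.partition(delimiter)
    if not sep or not prefix:
        return default_type
    return prefix


if __name__ == "__main__":
    for doc_id in ("profile:xxxx", "profile:42:draft", "xxxx"):
        print(doc_id, "->", select_type(doc_id))
```

Under this rule, a document only reaches the "profile" mapping if its Couchbase id starts with "profile" followed by the delimiter, so the key naming scheme in the bucket has to follow that convention.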