
How to create a search index in CouchDB?


Assuming CouchDB is configured locally, how and where do I create a search index, similar to the one in Cloudant on Bluemix?



Solution

  • The solution I was searching for was based on this library (couchdb-lucene).

    1. I had to install CouchDB 1.6.1 to have a database available at http://localhost:5984.
    2. The next step was to install couchdb-lucene, a Maven-based app, which ran on http://localhost:5985 and answered with a successful response:

    {"couchdb-lucene":"Welcome","version":"1.1.0-SNAPSHOT"}

    To get it running, I had to build it in the root directory with mvn, then navigate to target and run ./bin/run inside the unzipped couchdb-lucene directory:

    root@mario-VirtualBox:/home/mario/CouchDB_mario/couchdb-lucene/target/couchdb-lucene-1.1.0-SNAPSHOT# ./bin/run
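Before wiring the two servers together, it can help to confirm that couchdb-lucene really answers on its port. The following is a minimal Java sketch (Java 11+ for java.net.http), not part of the original setup: the class name LuceneCheck and its methods are hypothetical, and the check simply looks for the welcome key shown in the response above.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class LuceneCheck {
    // True if the body looks like the couchdb-lucene welcome document.
    // Whitespace is stripped first so pretty-printed JSON matches as well.
    public static boolean isWelcome(String body) {
        return body != null
            && body.replaceAll("\\s", "").contains("\"couchdb-lucene\":\"Welcome\"");
    }

    // Plain HTTP GET that returns the response body as a string.
    public static String fetch(String url) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }

    public static void main(String[] args) {
        // Assumption: couchdb-lucene listens on its default port 5985.
        String url = args.length > 0 ? args[0] : "http://localhost:5985";
        try {
            System.out.println(isWelcome(fetch(url))
                ? "couchdb-lucene is up"
                : "unexpected response from " + url);
        } catch (Exception e) {
            System.out.println("couchdb-lucene is not reachable: " + e);
        }
    }
}
```

Running it with no arguments checks http://localhost:5985; pass another URL as the first argument to check a different host or port.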
    
    3. The next task was to connect these two servers. All I had to do was map them via a proxy in /etc/couchdb/local.ini.

    All you need there is the following piece of configuration:

    [httpd_global_handlers]
    _fti = {couch_httpd_proxy, handle_proxy_req, <<"http://localhost:5985">>}
    

    With that in place, I was finally able to query CouchDB through the Apache Lucene index.

    4. Before querying, I had to insert my custom JSON design document: not a new design document through the UI, nor a new view, but a new JSON document. Essentially, this hacks CouchDB a little with a fake design document so that it can support Lucene search. I used a curl request in the following format:

    curl -X PUT http://localhost:5984/user14169_slovnik_medical/_design/medical -d @user14169_slovnik_medical.json

    Where the JSON Design Document looked like this:

     {
       "_id": "_design/medical",
       "fulltext": {
           "by_meaning": {
               "index": "function(doc) { var ret=new Document(); ret.add(doc.vyznam); return ret }"
           },
           "by_shortcut": {
               "index": "function(doc) { var ret=new Document(); ret.add(doc.zkratka); return ret }"
           }
       }
    }
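If several fields need an index, the design document above can also be generated instead of written by hand. This is only a sketch under assumptions: DesignDocBuilder and its methods are hypothetical names, field names are assumed to be plain identifiers (no quotes or backslashes, so no JSON escaping is done), and the added guard that returns null for documents lacking the field is my addition, relying on couchdb-lucene skipping documents whose index function returns null.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class DesignDocBuilder {
    // Index function for one field; documents without the field are skipped.
    public static String indexFunction(String field) {
        return "function(doc) { if (!doc." + field + ") return null; "
             + "var ret=new Document(); ret.add(doc." + field + "); return ret }";
    }

    // Builds the design document as a JSON string.
    // The map goes from index name (e.g. by_meaning) to document field (e.g. vyznam).
    public static String build(String designName, Map<String, String> indexToField) {
        StringBuilder sb = new StringBuilder();
        sb.append("{\"_id\":\"_design/").append(designName).append("\",\"fulltext\":{");
        boolean first = true;
        for (Map.Entry<String, String> e : indexToField.entrySet()) {
            if (!first) {
                sb.append(',');
            }
            first = false;
            sb.append('"').append(e.getKey()).append("\":{\"index\":\"")
              .append(indexFunction(e.getValue())).append("\"}");
        }
        return sb.append("}}").toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the indexes in insertion order.
        Map<String, String> indexes = new LinkedHashMap<>();
        indexes.put("by_meaning", "vyznam");
        indexes.put("by_shortcut", "zkratka");
        System.out.println(build("medical", indexes));
    }
}
```

The printed JSON can be saved to a file and uploaded with the same curl PUT request shown above.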
    
    5. As an example, with this search index defined and, say, this kind of data in the JSON documents:
      {
         "_id": "63e5c848fa2211c3b063d6feccd3d942",
         "_rev": "1-899a6924ed08097b1a37e497d91726fd",
         "DATAWORKS_DOCUMENT_TYPE": "user14169_slovnik_medical",
         "vyznam": "End to side",
         "zkratka": "e-t-s"
       }
    

    Then you can easily run queries like this:

    http://localhost:5984/_fti/local/user14169_slovnik_medical/_design/medical/by_meaning?q=lob~
    

    Which returns the expected data.

    The local prefix is there because I am running the database on localhost on a single node, and by default couchdb-lucene connects to localhost.
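The query URL follows a fixed pattern: CouchDB base URL, the _fti proxy path, the local key, database name, design document, index name, and the Lucene query in q. A small helper can assemble it and URL-encode the query, which matters for operators such as ~ (fuzzy match) and for terms with spaces. This is a sketch assuming Java 10+; FtiUrl and its parameter names are hypothetical.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class FtiUrl {
    // Assembles <base>/_fti/<key>/<db>/_design/<designDoc>/<index>?q=<encoded query>.
    public static String build(String couchBase, String luceneKey, String db,
                               String designDoc, String index, String query) {
        // URLEncoder leaves '*' as-is and turns '~' into %7E and spaces into '+'.
        String q = URLEncoder.encode(query, StandardCharsets.UTF_8);
        return couchBase + "/_fti/" + luceneKey + "/" + db
             + "/_design/" + designDoc + "/" + index + "?q=" + q;
    }

    public static void main(String[] args) {
        // Same fuzzy query as in the example above.
        System.out.println(build("http://localhost:5984", "local",
            "user14169_slovnik_medical", "medical", "by_meaning", "lob~"));
    }
}
```

The couchBase argument is expected without a trailing slash; the helper adds the separators itself.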

    The coolest thing is that you can use the org.lightcouch client library (a jar) in Java and make simple calls like this:

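    import com.google.gson.JsonObject;
    import org.lightcouch.CouchDbClient;

    // Connect to the database (created if missing) on the local CouchDB node.
    CouchDbClient dbClient = new CouchDbClient("user14169_slovnik_medical", true, "http", "127.0.0.1", 5984, null, null);

    // Full-text query routed through the _fti proxy to couchdb-lucene.
    String uriFullText = dbClient.getBaseUri() + "_fti/local/user14169_slovnik_medical/_design/medical/by_shortcut?q=lob*";

    JsonObject result = dbClient.findAny(JsonObject.class, uriFullText);

    System.out.println(result.toString());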