
Stratio Lucene for Cassandra


I am new to Lucene and have a few basic questions:

  • How can I view all the indexes created using Stratio Lucene?

  • How can I delete indexes created using Stratio Lucene?

  • What is the difference between type: "string" and type: "text" in a schema like this?

    fields: {
         fld_1: {type: "string"},
         fld_2: {type: "text"}
    }

I ask about the difference because I ran into an error while creating my very first Lucene index. My column in Cassandra is declared as 'fld_1 text', but when I tried to create an index on fld_1 as shown above, it threw an exception:

ConfigurationException: 'schema' is invalid : Unparseable JSON schema: Unexpected character ('}' (code 125)): was expecting either valid name character (for unquoted name) or double-quote (for quoted) to start field name
at [Source: {
fields: {

The Lucene index script:

CREATE CUSTOM INDEX lucene_index ON testTable ()
USING 'com.stratio.cassandra.lucene.Index'
WITH OPTIONS = {
   'refresh_seconds': '1',
   'schema': '{
fields: {
     fld_1: {type: "string"},
     fld_2: {type: "string"},
     id: {type: "integer"},
     test_timestamp: {type: "date", pattern: "yyyy/MM/dd HH:mm:ss"}
  }
}'
};

Thanks!


Solution

  • First: You can't list only the Stratio Lucene indexes. The query below shows all indexes:

    SELECT * FROM system."IndexInfo"; 
    

    Second: You can delete an index with the DROP INDEX index_name command, e.g.:

    DROP INDEX test;
    

    Third: In a Stratio Lucene index, string is a not-analyzed text value, while text is a language-aware text value that is analyzed according to the specified analyzer.

    This means that a field mapped as string is indexed and queried verbatim. A field mapped as text is first processed by the analyzer you specify, which defaults to default_analyzer (org.apache.lucene.analysis.standard.StandardAnalyzer), and only then indexed and queried.
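To make the string/text distinction concrete, here is a minimal Python sketch of the idea. It is a toy model, not Stratio's or Lucene's actual code: the functions analyze, match_string and match_text are hypothetical names, and the "analyzer" here only lowercases and splits on non-alphanumeric characters, which is far simpler than what StandardAnalyzer really does.

```python
import re


def analyze(value):
    """Very rough stand-in for an analyzer: lowercase, then split into
    alphanumeric tokens. A real Lucene analyzer does considerably more."""
    return re.findall(r"[a-z0-9]+", value.lower())


def match_string(stored, query):
    """'string' mapping: the whole value is one term, compared verbatim."""
    return stored == query


def match_text(stored, query):
    """'text' mapping: stored value and query both pass through the analyzer,
    then every query token must appear among the stored tokens.
    (Requiring all tokens is an assumption made for this toy model.)"""
    stored_tokens = set(analyze(stored))
    query_tokens = analyze(query)
    return bool(query_tokens) and all(t in stored_tokens for t in query_tokens)


value = "Hello Cassandra World"

print(match_string(value, "Hello Cassandra World"))  # exact value matches
print(match_string(value, "cassandra"))              # a single word does not
print(match_text(value, "cassandra"))                # an analyzed token matches
print(match_text(value, "CASSANDRA"))                # case is normalized away
```

In this model a string field only matches when the query equals the stored value exactly, whereas a text field matches on individual normalized words. That is roughly why string suits identifiers and tags, and text suits free-form prose.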

    Edit:

    You first have to create a text column in Cassandra and then specify it when creating the index.

    Example:

    ALTER TABLE testtable ADD lucene text;
    
    CREATE CUSTOM INDEX lucene_index ON testTable (lucene) USING 'com.stratio.cassandra.lucene.Index'
    WITH OPTIONS = {
       'refresh_seconds': '1',
       'schema': '{
         fields: {
             fld_1: {type: "string"},
             fld_2: {type: "string"},
             id: {type: "integer"},
             test_timestamp: {type: "date", pattern: "yyyy/MM/dd HH:mm:ss"}
         }
       }'
    };
    

    For more details, see: https://github.com/Stratio/cassandra-lucene-index/blob/branch-3.0.13/doc/documentation.rst#text-mapper