How do you do lowercase prefix filtering with the Stratio Cassandra Lucene Index?


Is it possible to do lowercase prefix filtering/querying with Stratio Cassandra Lucene Index? I haven't been able to find documentation on this particular use case.


Solution

  • The casing of queries depends on the Lucene text analyzer used during indexing and can't be decided at query time. If you want to do case-insensitive prefix searches, you should use a mapper with an analyzer that produces lowercase terms. These terms will be indexed and matched at search time. For example:

    CREATE KEYSPACE test
    WITH REPLICATION = {'class' : 'SimpleStrategy', 'replication_factor': 1};
    USE test;
    CREATE TABLE test (
        id INT PRIMARY KEY,
        body TEXT
    );
    
    CREATE CUSTOM INDEX test_index ON test ()
    USING 'com.stratio.cassandra.lucene.Index'
    WITH OPTIONS = {
        'refresh_seconds' : '1',
        'schema' : '{
            fields : {
                body1 : {type :"string", column:"body", case_sensitive:false},
                body2 : {type :"string", column:"body", case_sensitive:true}
            }
        }'
    };
    
    INSERT INTO test(id,body) VALUES ( 1, 'foo');
    INSERT INTO test(id,body) VALUES ( 2, 'Foo');
    INSERT INTO test(id,body) VALUES ( 3, 'bar');
    INSERT INTO test(id,body) VALUES ( 4, 'Bar');
    
    
    SELECT * FROM test WHERE expr(test_index, 
       '{filter:{type:"prefix", field:"body2", value:"f"}}'); -- Returns foo
    SELECT * FROM test WHERE expr(test_index, 
       '{filter:{type:"prefix", field:"body2", value:"F"}}'); -- Returns Foo
    
    SELECT * FROM test WHERE expr(test_index, 
       '{filter:{type:"prefix", field:"body1", value:"f"}}'); -- Returns foo and Foo
    SELECT * FROM test WHERE expr(test_index, 
       '{filter:{type:"prefix", field:"body1", value:"F"}}'); -- Returns no results: the indexed terms are lowercase and the prefix is matched as given
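
The four results above can be reproduced with a small toy model. This is only a sketch of the behaviour described in the answer, not the Stratio or Lucene API: `build_index` and `prefix_filter` are hypothetical helpers, and the assumption is that a `case_sensitive:false` field stores lowercased terms while the prefix value is compared as typed.

```python
# Toy model only: these helpers are hypothetical and are not part of
# the Stratio Cassandra Lucene Index or of Lucene itself.
rows = {1: "foo", 2: "Foo", 3: "bar", 4: "Bar"}


def build_index(rows, case_sensitive):
    """Map each row id to the term that would be stored for the field."""
    return {
        row_id: (body if case_sensitive else body.lower())
        for row_id, body in rows.items()
    }


def prefix_filter(index, rows, prefix):
    """Compare the prefix, exactly as typed, against the stored terms.

    Returns the original column values of the matching rows, ordered by id.
    """
    return [
        rows[row_id]
        for row_id, term in sorted(index.items())
        if term.startswith(prefix)
    ]


body1 = build_index(rows, case_sensitive=False)  # stores foo, foo, bar, bar
body2 = build_index(rows, case_sensitive=True)   # stores foo, Foo, bar, Bar

print(prefix_filter(body2, rows, "f"))           # case-sensitive field, lowercase prefix
print(prefix_filter(body2, rows, "F"))           # case-sensitive field, uppercase prefix
print(prefix_filter(body1, rows, "f"))           # lowercased field, lowercase prefix
print(prefix_filter(body1, rows, "F"))           # lowercased field, uppercase prefix
print(prefix_filter(body1, rows, "F".lower()))   # prefix normalised by the caller
```

Under that assumption, the practical consequence is that a case-insensitive prefix search needs two things: a field indexed with `case_sensitive:false`, and a prefix value that the client lowercases before building the query.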