Zend Framework Lucene Boolean / "Google"-like search


I'm working on the application at http://demos.zatechcorp.com/codeigniter/

In the version currently running on my machine, I loaded Zend Framework inside CodeIgniter and generated an index like this:

    // ... Some code that loads all the markets
    foreach ($markets as $market)
    {
        $doc = new Zend_Search_Lucene_Document();
        // Id for retrieval
        $doc->addField(Zend_Search_Lucene_Field::UnIndexed('id', $market->id));
        // Store document URL to identify it in search result.
        $doc->addField(Zend_Search_Lucene_Field::Text('url', $market->permalink));
        // Index document content
        $doc->addField(Zend_Search_Lucene_Field::UnStored('contents', $market->description));
        // Title
        $doc->addField(Zend_Search_Lucene_Field::Text('title', $market->title));
        // Phone
        $doc->addField(Zend_Search_Lucene_Field::Keyword('phone', $market->phone));
        // Fax
        $doc->addField(Zend_Search_Lucene_Field::Keyword('fax', $market->fax));
        // Street
        $doc->addField(Zend_Search_Lucene_Field::Keyword('street', $market->street));
        // City
        $doc->addField(Zend_Search_Lucene_Field::Keyword('city', $market->city));
        // State
        $doc->addField(Zend_Search_Lucene_Field::Keyword('state', $market->state));
        // Zip
        $doc->addField(Zend_Search_Lucene_Field::Keyword('zip', $market->zip));
        // Type
        $doc->addField(Zend_Search_Lucene_Field::UnIndexed('type', 'market'));

        // Store Document
        $index->addDocument($doc);
    }

In my search, I do this:

    $hits    = $index->find($q);

This works with simple words, but when I search for a phrase like "Sheba Foods" (quotes included), it returns a single result, and it is the wrong one: it doesn't even contain the word "Sheba".

I moved away from MySQL full-text search because of its obvious problems, but I can't make any headway with this either.

I've been looking at the Zend_Search_Lucene_Search_QueryParser::parse() method. Does the answer lie in this method?


Solution

  • I figured it out. With Lucene, you can add a field named 'id', but reading id from a hit gives you something different: the hit's own id property, which is the document's internal number within the index, not the value I stored.

    What I had to do in this case was use a different field name, like this:

        // Id for retrieval
        $doc->addField(Zend_Search_Lucene_Field::UnIndexed('item_id', $market->id));
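
To illustrate why a stored field named 'id' gets hidden, here is a minimal, self-contained sketch. `FakeHit` is a hypothetical stand-in, not Zend's actual hit class, and the sample values (document number 7, market id 42) are made up. It only assumes that a hit exposes `id` and `score` as declared public properties and falls back to `__get()` for stored fields, which is how the behavior above appears to arise.

```php
<?php
// Hypothetical stand-in for a Lucene search hit; NOT Zend's actual class.
// It mimics one relevant detail: a public $id property that holds the
// internal document number, plus a __get() fallback for stored fields.
class FakeHit
{
    public $id;     // internal document number within the index
    public $score;  // relevance score

    private $storedFields;

    public function __construct($docNumber, $score, array $storedFields)
    {
        $this->id = $docNumber;
        $this->score = $score;
        $this->storedFields = $storedFields;
    }

    // PHP calls __get() only for properties that are not accessible,
    // so it is never reached for "id" or "score".
    public function __get($name)
    {
        return $this->storedFields[$name];
    }
}

// Assumed sample data: market 42 happens to be document number 7.
$hit = new FakeHit(7, 0.83, array('id' => 42, 'item_id' => 42));

echo $hit->id, "\n";       // declared property wins: prints 7, the document number
echo $hit->item_id, "\n";  // no such property, so __get() runs: prints 42
```

With the renamed field, the search loop can read `$hit->item_id` directly. If renaming is not an option, the stored value should still be reachable through the hit's document, for example `$hit->getDocument()->getFieldValue('id')`, though I'd treat that as a workaround rather than the cleaner fix.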