python, search, django-haystack, whoosh

haystack multiple field search


Hi, I'm using Haystack with Whoosh as the search engine.

My models look as follows:

class Person(models.Model):
    personid = models.IntegerField(primary_key = True, db_column = 'PID')  
    firstname = models.CharField(max_length = 50, db_column = 'FIRSTNAME')  
    lastname = models.CharField(max_length = 50, db_column = 'LASTNAME') 
    class Meta:
        db_table = '"TEST"."PERSON"'
        managed = False


class TDoc(models.Model):
    tdocid = models.IntegerField(primary_key = True, db_column = 'TDOCID')  
    person = models.ForeignKey(Person, db_column = 'PID')
    content = models.TextField(db_column = 'CONTENT', blank = True) 
    filepath = models.TextField(db_column = 'FILEPATH', blank = True) 
    class Meta:
        db_table = '"TEST"."TDOC"'
        managed = False

The search_index.py is as follows:

class TDocIndex(SearchIndex):

    content = CharField(model_attr = 'content', document = True)
    filepath = CharField(model_attr = 'filepath')
    person = CharField(model_attr = 'person')

    def get_queryset(self):
        return TDoc.objects.all()

    def prepare_person(self, obj):
        # Index the related person's last name
        return obj.person.lastname

site.register(TDoc, TDocIndex)

My problem is that I would like to do multiple-field searches like:

content:xxx AND person:SMITH

Haystack searches all of the fields at once, so I can't do a field-specific search. I suspected that my index was corrupt, but I checked it directly with Whoosh:

from whoosh.index import open_dir
from whoosh.qparser import MultifieldParser

ix = open_dir("/testindex")
searcher = ix.searcher()

mparser = MultifieldParser(["content", "filepath", "person"], schema = ix.schema)
myquery = mparser.parse(u'content:xxx AND person:SMITH')
results = searcher.search(myquery)
for result in results:
    print(result)

It works and returns the correct results. I'm using the standard Haystack SearchView and search.html from the tutorial:

(r'^search/', include('haystack.urls')),

Solution

  • In your index you should define one field with document=True; this is the document Haystack searches on. By convention this field is named text. You add extra fields only if you plan to filter or order on their values.

    To take several fields into account when you perform a search, define the document as a template and set use_template=True on your document field. Your index would look like:

    class TDocIndex(SearchIndex):
    
        text = CharField(document=True, use_template=True)
    
        # if you plan to filter by person (Person's primary key is personid, not id)
        personid = IntegerField(model_attr='person__personid') 
    
    site.register(TDoc, TDocIndex)
    

    And you'd have a search/indexes/<app_label>/tdoc_text.txt template (Haystack looks for search/indexes/{app_label}/{model_name}_{field_name}.txt) like:

    {{ object.content }}
    {{ object.filepath }}
    {{ object.person.lastname }}
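
    To make it concrete what ends up in the text field, here is a rough plain-Python sketch of what that template produces for one object. It does not use Django or Haystack; the Person and TDoc dataclasses and the render_tdoc_text helper are hypothetical stand-ins, and the sample values are made up.

```python
from dataclasses import dataclass


@dataclass
class Person:
    # Hypothetical stand-in for the Django Person model
    lastname: str


@dataclass
class TDoc:
    # Hypothetical stand-in for the Django TDoc model
    content: str
    filepath: str
    person: Person


def render_tdoc_text(obj):
    """Mimic the three-line template: one attribute per line."""
    return "\n".join([obj.content, obj.filepath, obj.person.lastname])


doc = TDoc(
    content="quarterly report",
    filepath="/docs/q1.txt",
    person=Person(lastname="SMITH"),
)
text = render_tdoc_text(doc)
print(text)
```

    Since the last name becomes part of the single indexed document, a plain search for SMITH matches it without a field prefix.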
    

    See this answer.
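
For the field-specific part (the equivalent of content:xxx AND person:SMITH), one option is to filter on the extra field in your own view instead of relying on the stock SearchView. Below is a minimal sketch, assuming the index above with its personid field. The search_tdocs helper is a hypothetical name; it only relies on the queryset having a chainable filter(**kwargs) method, which Haystack's SearchQuerySet provides.

```python
def search_tdocs(sqs, query, personid=None):
    """Search the document field, optionally narrowed to one person.

    sqs is expected to behave like haystack.query.SearchQuerySet:
    filter(**kwargs) returns a new queryset, and chained filters are ANDed.
    """
    # 'content' is Haystack's special keyword for the document=True field
    results = sqs.filter(content=query)
    if personid is not None:
        # Narrow by the extra indexed field (personid in the index above)
        results = results.filter(personid=personid)
    return results
```

In a Django project you would call it as search_tdocs(SearchQuerySet(), 'xxx', personid=42), with SearchQuerySet imported from haystack.query. To filter by last name instead, you would need to index the last name as its own field and filter on that field.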