Django Haystack Whoosh Multilanguage site


I am using Haystack with the Whoosh backend in my Django project. My models are multilingual via the modeltranslation module, which automatically creates fields such as title_tr and title_en for a field named title.

I am trying to make searches aware of the selected language. After searching the net I wrote the code below, but it does not work for the title_tr and entry_tags_tr fields.

# search_indexes.py

from haystack import indexes
from aksak.blog.models import Entry

class EntryIndex(indexes.SearchIndex, indexes.Indexable):
    text = indexes.CharField(model_attr='descr_en', document=True, use_template=True)
    text_tr = indexes.CharField(model_attr='descr_tr')
    title_en = indexes.CharField(model_attr='title_en')
    title_tr = indexes.CharField(model_attr='title_tr')
    tags = indexes.MultiValueField(indexed=True, stored=True, model_attr='entry_tags')

    def get_model(self):
        return Entry


# haystackCustomQuery.py (in urls.py I am using this custom view)

from django.conf import settings
from django.utils.translation import get_language
from haystack.query import SearchQuerySet, DEFAULT_OPERATOR

class MlSearchQuerySet(SearchQuerySet):
    def filter(self, **kwargs):
        if 'content' in kwargs:
            kwd = kwargs.pop('content')
            currentLngCode = str(get_language())
            lngCode = settings.LANGUAGE_CODE
            if currentLngCode == lngCode:
                kwdkey = "text"
                kwargs[kwdkey] = kwd
            else:
                kwdkey = "text_%s" % currentLngCode
                kwargs[kwdkey] = kwd

        if getattr(settings, 'HAYSTACK_DEFAULT_OPERATOR', DEFAULT_OPERATOR) == 'OR':
            return self.filter_or(**kwargs)
        else:
            return self.filter_and(**kwargs)

Solution

  • Not sure, but I think it might be related to the model_attr parameters of your SearchIndex subclass, which aren't properly resolved by Haystack. Try defining prepare_<index_fieldname> methods instead.

    I'm including a full example of what I used in a German/English project. Just like you, and inspired by "search functionality on multi-language django site", I got the current language and mapped it to a SearchIndex field:

    from django.conf import settings
    from modeltranslation.utils import get_language
    from modeltranslation.settings import DEFAULT_LANGUAGE
    from haystack.query import SearchQuerySet, DEFAULT_OPERATOR
    
    class ModeltranslationSearchQuerySet(SearchQuerySet):
        def filter(self, **kwargs):
            if 'content' in kwargs:
                kwd = kwargs.pop('content')
                lang = get_language()
                if lang != DEFAULT_LANGUAGE:
                    kwdkey = "text_%s" % lang
                    kwargs[kwdkey] = kwd
                else:
                    kwargs['text'] = kwd
            if getattr(settings, 'HAYSTACK_DEFAULT_OPERATOR', DEFAULT_OPERATOR) == 'OR':
                return self.filter_or(**kwargs)
            else:
                return self.filter_and(**kwargs)
    

    In essence there is one SearchIndex field per language, each with its own prepare_<index_fieldname> method. Here's a stripped-down version of my search_indexes.py. It's not really generic, but it worked well for my simple requirements:

    from haystack import indexes
    
    class ContentIndex(indexes.SearchIndex, indexes.Indexable):
        text = indexes.EdgeNgramField(document=True)  # German (default language)
        text_en = indexes.EdgeNgramField()  # English
    
        def prepare_text(self, obj):
            return '%s %s' % (obj.title_de, obj.descr_de)
    
        def prepare_text_en(self, obj):
            return '%s %s' % (obj.title_en, obj.descr_en)
    

    Note that text, rather than text_de, is used here for the index field with document=True, as that is a Haystack convention (you got that right). Inside the prepare_<index_fieldname> methods, however, the actual translation field names are used. The index also doesn't use a template, just simple string concatenation.
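
As a hedged aside on the string concatenation in the prepare methods: `'%s %s' % (a, b)` renders a missing translation as the literal text `None`, which then ends up in the index. Below is a minimal sketch of a helper that skips empty translations. The name `translated_text` and its `fields` default are hypothetical, and `SimpleNamespace` merely stands in for a model instance with modeltranslation-style attributes.

```python
from types import SimpleNamespace


def translated_text(obj, lang, fields=("title", "descr")):
    """Join the translated values of `fields` for `lang`, skipping empty ones."""
    # Look up e.g. obj.title_en and obj.descr_en; a missing attribute counts as empty.
    values = (getattr(obj, "%s_%s" % (name, lang), None) for name in fields)
    return " ".join(value for value in values if value)


# Stand-in for a model instance (assumption); a real one would come from the ORM.
entry = SimpleNamespace(title_de="Hallo", descr_de="Welt",
                        title_en="Hello", descr_en=None)

german_text = translated_text(entry, "de")
english_text = translated_text(entry, "en")
```

With a helper like this, a method such as prepare_text_en could simply return `translated_text(obj, 'en')`.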

    The generic SearchView included in Haystack (I used 2.0.0-beta) takes a searchqueryset parameter, so the ModeltranslationSearchQuerySet can be passed directly in the URL setup like this (as far as I understand, you already have that):

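    from django.conf.urls import url
    from haystack.forms import ModelSearchForm
    from haystack.views import SearchView

    # ModeltranslationSearchQuerySet is the class shown earlier; import it from wherever you defined it.
    url(r'^search/$', SearchView(
        searchqueryset=ModeltranslationSearchQuerySet(), form_class=ModelSearchForm))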
    

    Final note: I haven't used that code in a while, and modeltranslation has had some major changes regarding current-language awareness. It may well be that this can be simplified a bit for modeltranslation >= 0.6.
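
The language-to-field mapping at the heart of the filter() override in this answer can also be pulled out into plain functions, which makes it easy to check without a search backend. This is only a sketch: `index_field_for_language` and `localize_content_filter` are hypothetical names, and replacing `-` with `_` for regional codes such as `pt-br` is an assumption about how you name your index fields.

```python
def index_field_for_language(lang, default_language, base="text"):
    """Return the index field to search for the given language code."""
    if not lang or lang == default_language:
        # The default language lives in the document=True field.
        return base
    # Assumption: index fields for regional codes use '_' (e.g. text_pt_br).
    return "%s_%s" % (base, lang.replace("-", "_"))


def localize_content_filter(kwargs, lang, default_language):
    """Return a copy of kwargs with 'content' re-keyed to the language field."""
    localized = dict(kwargs)
    if "content" in localized:
        keyword = localized.pop("content")
        localized[index_field_for_language(lang, default_language)] = keyword
    return localized


# German is the default language here, mirroring the answer's setup.
default_filter = localize_content_filter({"content": "haus"}, "de", "de")
english_filter = localize_content_filter({"content": "house"}, "en", "de")
```

Inside filter(), the equivalent call would be something like `kwargs = localize_content_filter(kwargs, get_language(), DEFAULT_LANGUAGE)` before handing off to filter_or or filter_and.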