Update Django Haystack search index for prepared field


I'm using Django Haystack. Here is my code:

settings.py

HAYSTACK_CONNECTIONS = {
    'default': {
        'ENGINE': 'haystack.backends.elasticsearch_backend.ElasticsearchSearchEngine',
        'URL': 'http://127.0.0.1:9200/',
        'INDEX_NAME': 'haystack',
    },
}

HAYSTACK_SIGNAL_PROCESSOR = 'haystack.signals.RealtimeSignalProcessor'

search_indexes.py

from haystack import indexes

from .models import Post  # assumed location of the Post model


class PostIndex(indexes.SearchIndex, indexes.Indexable):
    text = indexes.CharField(document=True, use_template=True)
    owner = indexes.CharField(model_attr='owner')
    image_url = indexes.CharField()

    def get_model(self):
        return Post

    def prepare_image_url(self, obj):
        # URL of the post's first image (lowest id), shown with each search result
        return [image.image_main_page.url for image in obj.images.order_by('id')[:1]]

    def index_queryset(self, using=None):
        """Used when the entire index for model is updated."""
        return self.get_model().objects.all()

As you can see, I use RealtimeSignalProcessor so that a Post instance is indexed whenever it is created or updated. On creation the instance is indexed, except for the image_url field, which is filled by a prepare method. That field is indexed only when the instance is later updated.

Why isn't image_url indexed on creation?

Any pointers are appreciated.


Solution

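  • The Post is presumably saved before its images exist, and saving an image later does not re-index the Post. I ended up with a custom signal processor that also re-indexes related objects: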

    from haystack.signals import RealtimeSignalProcessor


    class RelatedRealtimeSignalProcessor(RealtimeSignalProcessor):
        """
        Extension of Haystack's RealtimeSignalProcessor that updates the search
        index not only for the saved model but also for its related objects, so
        the image URL shown in search results stays current.
        """

        def handle_save(self, sender, instance, **kwargs):
            # Re-index every object named in the optional reindex_related attribute.
            if hasattr(instance, 'reindex_related'):
                for related in instance.reindex_related:
                    related_obj = getattr(instance, related)
                    self.handle_save(related_obj.__class__, related_obj)
            return super(RelatedRealtimeSignalProcessor, self).handle_save(sender, instance, **kwargs)

        def handle_delete(self, sender, instance, **kwargs):
            # Related objects are re-indexed, not deleted: calling handle_delete
            # on them would remove them from the index along with this instance.
            if hasattr(instance, 'reindex_related'):
                for related in instance.reindex_related:
                    related_obj = getattr(instance, related)
                    self.handle_save(related_obj.__class__, related_obj)
            return super(RelatedRealtimeSignalProcessor, self).handle_delete(sender, instance, **kwargs)
    

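    A model whose changes should re-index a related object declares a reindex_related attribute listing the attribute names to follow. Then I pointed to the processor in settings: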

    HAYSTACK_SIGNAL_PROCESSOR = 'your_app.signals.RelatedRealtimeSignalProcessor'
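To make the ordering issue concrete, here is a Django-free simulation of what probably happens. It is only a sketch: it assumes the images are saved after the Post they belong to, and every name in it (FakeIndex, the plain Post and Image classes, the follow_related flag) is hypothetical and stands in for the real models, signals and search backend.

```python
# Django-free simulation; all classes and names here are hypothetical.

class FakeIndex:
    """Stands in for the search backend: maps post id -> indexed document."""

    def __init__(self):
        self.docs = {}

    def update_post(self, post):
        # Mirrors prepare_image_url: take the first image URL, if one exists yet.
        urls = [img.url for img in post.images[:1]]
        self.docs[post.id] = {"owner": post.owner, "image_url": urls}


class Post:
    def __init__(self, id, owner):
        self.id, self.owner, self.images = id, owner, []


class Image:
    # Attribute names of objects to re-index whenever an Image is saved.
    reindex_related = ("post",)

    def __init__(self, post, url):
        self.post, self.url = post, url


def handle_save(index, instance, follow_related):
    """Simplified save handler: index Posts, optionally follow reindex_related."""
    if follow_related:
        for name in getattr(instance, "reindex_related", ()):
            handle_save(index, getattr(instance, name), follow_related)
    if isinstance(instance, Post):
        index.update_post(instance)


def create_post_with_image(index, follow_related):
    post = Post(1, "alice")
    handle_save(index, post, follow_related)   # Post saved first: no images yet
    image = Image(post, "/media/a.jpg")
    post.images.append(image)
    handle_save(index, image, follow_related)  # Image saved afterwards
    return index.docs[post.id]["image_url"]


plain = create_post_with_image(FakeIndex(), follow_related=False)
related = create_post_with_image(FakeIndex(), follow_related=True)
print(plain)    # []
print(related)  # ['/media/a.jpg']
```

Without following related objects, the document indexed at creation time has no image URL, which matches the behaviour described in the question. On the real models, the equivalent would be a class attribute such as reindex_related = ('post',) on the image model, assuming its foreign key to Post is named post.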