django-haystack whoosh

Sharing Whoosh index


I'm implementing a CMS site in Django, and I'd like to add full-text content search. The site is fairly small and will generate low search traffic, so I think Whoosh will be a reasonable production solution.

My current understanding is that Whoosh indexing and result generation happen in the application process, rather than requiring a separate daemon, which is great. However, I'm a bit worried about concurrent access to the index. Can a single Whoosh index support reads (and potentially writes) from multiple, uncoordinated processes? For example, will it be problematic if the same index is shared by load-balanced Django application servers, either in terms of serious performance degradation or index corruption?

Thanks in advance for your advice.


Solution

  • Based on the Whoosh documentation, it appears that an index can be shared between several threads or processes. The docs on indexing here: http://packages.python.org/Whoosh/indexing.html#indexing-documents indicate that the index is locked for writes while it is being updated, so only one writer can be open at a time. Readers can still open and search the index while a writer holds that lock, so I'd expect a read-heavy application to be mostly OK.