python solr full-text-search sphinx whoosh

Among Lucene/Solr, Whoosh, Sphinx, and Xapian, which integrates best with Python?


I am a new developer at a startup, and I am implementing search over the documents in a directory on a web host.

I am comparing Lucene/Solr, Whoosh, Sphinx, and Xapian. Whoosh is written natively in Python, but I would like your opinions on it too. Which of these has

  • a mature Python interface that is easy to install and use? (Whoosh is a no-brainer here)
  • the lowest risk of crashes, bottlenecks, and other failures?
  • the best-documented Python interface? (I'd rather not fall back on reading PHP docs because the Python docs are sparse)
  • the quickest path to getting up and running? (only one has a quick-start tutorial)

Solution

  • Speaking for Apache Solr: Python has several Solr clients. These are the ones I've collected based on feedback from our customers at Websolr:

    1. Haystack is very popular, and designed for seamless integration within Django apps. If you're developing a Django app, Haystack is for you.
    2. Sunburnt looks to be more generic than Haystack, and is also very well documented. If you're doing plain ol' Python, Sunburnt is worth a look.

    Other Python Solr clients that I've found, which seem a bit lower level...

    Some more details about how your app is built (in particular, is it a Django app?) would help narrow things down from here. Good luck finding the best fit for your app!
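
    To give a feel for what "lower level" means here: Solr is queried over HTTP, so a client library is largely a convenience layer on top of requests to its `/select` handler. The following is only a rough sketch using the Python standard library, not how Haystack or Sunburnt work internally. The host, port, and the core name `documents` are assumptions for illustration, as are the helper names `build_select_url` and `search`.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: Solr runs locally and has a core named "documents".
SOLR_BASE = "http://localhost:8983/solr/documents"


def build_select_url(query, rows=10, base=SOLR_BASE):
    """Build a URL for Solr's /select handler, asking for JSON output."""
    params = {"q": query, "rows": rows, "wt": "json"}
    return f"{base}/select?{urlencode(params)}"


def search(query, rows=10, base=SOLR_BASE):
    """Send the query to Solr and return the list of matching documents."""
    with urlopen(build_select_url(query, rows, base)) as resp:
        payload = json.load(resp)
    # Solr's JSON response nests the hits under response -> docs.
    return payload["response"]["docs"]
```

    Calling `search("title:python")` would return the matching documents as a list of dicts, provided a Solr instance is actually reachable at the assumed address. A dedicated client is still the better choice once you need indexing, faceting, or error handling.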