python django-haystack whoosh

Whoosh index viewer


I'm using Haystack with Whoosh as the backend for a Django app.

Is there any way to view the content of the indexes generated by Whoosh in an easy-to-read format? I'd like to see what data was indexed, and how, so I can better understand how it works.


Solution

  • You can do this pretty easily from Python's interactive console:

    >>> from whoosh.index import open_dir
    >>> ix = open_dir('whoosh_index')
    >>> ix.schema
    <<< <Schema: ['author', 'author_exact', 'content', 'django_ct', 'django_id', 'id', 'lexer', 'lexer_exact', 'published', 'published_exact']>
    

    You can also run search queries directly against the index. For example, to fetch every document that has a term in the content field (limit=None lifts the default cap of 10 results):

    >>> from whoosh.query import Every
    >>> results = ix.searcher().search(Every('content'), limit=None)
    

    To print everything out for inspection, loop over the results in a short Python script:

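    # Only stored fields can be read back from a hit; print() with a single
    # argument works on both Python 2 and Python 3.
    for result in results:
        print("Rank: %s Id: %s Author: %s" % (result.rank, result['id'], result['author']))
        print("Content:")
        print(result['content'])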
    

    You could also return the documents directly from Whoosh in a Django view, for example to format them nicely with Django's template system. Refer to the Whoosh documentation for more info: http://packages.python.org/Whoosh/index.html.
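
The console steps above can be wrapped into a small script. This is only a minimal sketch: `format_document` and `dump_index` are hypothetical helper names, not part of Whoosh or Haystack, and it assumes the index lives in a directory such as `whoosh_index`. It reads the stored fields through the index reader rather than through a query, so it shows only fields that were stored, and it can optionally list the first few terms indexed for one field.

```python
from itertools import islice


def format_document(docnum, fields):
    """Render one document's stored fields as indented, readable text."""
    lines = ["Document %d" % docnum]
    for name in sorted(fields):
        lines.append("  %s: %r" % (name, fields[name]))
    return "\n".join(lines)


def dump_index(index_dir, term_field=None, max_terms=20):
    """Print every document's stored fields, plus some terms of one field."""
    # Imported lazily so format_document() works without Whoosh installed.
    from whoosh.index import open_dir

    ix = open_dir(index_dir)
    with ix.searcher() as searcher:
        reader = searcher.reader()
        # iter_docs() yields (docnum, stored_fields) for undeleted documents.
        for docnum, fields in reader.iter_docs():
            print(format_document(docnum, fields))
        if term_field is not None:
            # field_terms() yields the terms indexed for the given field.
            terms = list(islice(reader.field_terms(term_field), max_terms))
            print("First terms in %r: %r" % (term_field, terms))
```

Calling `dump_index('whoosh_index', term_field='content')` from the project directory would then print each document followed by a sample of the terms Whoosh indexed for the content field, which gives a rough picture of how the text was analyzed.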