I'm using haystack with whoosh as backend for a Django app.
Is there any way to view the content (in an easy-to-read format) of the indexes generated by whoosh? I'd like to see what data was indexed and how, so I can better understand how it works.
You can do this pretty easily from python's interactive console:
>>> from whoosh.index import open_dir
>>> ix = open_dir('whoosh_index')
>>> ix.schema
<<< <Schema: ['author', 'author_exact', 'content', 'django_ct', 'django_id', 'id', 'lexer', 'lexer_exact', 'published', 'published_exact']>
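To see how each field is indexed, and not just its name, you can walk the schema and peek at the terms stored for a field. This is a minimal sketch: `describe_schema` and `sample_terms` are hypothetical helper names, and it assumes a Whoosh 2.x index whose reader exposes `field_terms`.

```python
from itertools import islice


def describe_schema(ix):
    """Return (name, field type, stored?, indexed?) for every schema field."""
    return [
        (name, type(field).__name__, bool(field.stored), bool(field.indexed))
        for name, field in ix.schema.items()
    ]


def sample_terms(ix, fieldname, count=20):
    """Return up to `count` indexed terms of one field, in index order."""
    reader = ix.reader()
    try:
        # field_terms() yields the terms the analyzer produced for this
        # field (assumed available, as in Whoosh 2.x readers).
        return list(islice(reader.field_terms(fieldname), count))
    finally:
        reader.close()


# Usage, assuming the same index directory as above:
#   from whoosh.index import open_dir
#   ix = open_dir('whoosh_index')
#   for row in describe_schema(ix):
#       print(row)
#   print(sample_terms(ix, 'content'))
```

Comparing the terms of `content` with the original text is a quick way to see what the analyzer did (lowercasing, stop-word removal and so on).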
You can perform search queries directly on your index and do all sorts of fun stuff. To get every document I could do this:
>>> from whoosh.query import Every
>>> results = ix.searcher().search(Every('content'), limit=None)  # search() returns only the top 10 hits unless limit=None
If you wanted to print it all out (for viewing or whatnot), you could do so pretty easily using a python script.
for result in results:
    print("Rank: %s Id: %s Author: %s" % (result.rank, result['id'], result['author']))
    print("Content:")
    print(result['content'])
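If you only need the raw stored fields and not a ranked result set, a reader can walk the index directly without running a query. A rough sketch, where `dump_stored_fields` is a made-up helper name and only fields marked as stored in the schema will show up:

```python
from itertools import islice


def dump_stored_fields(ix, limit=None):
    """Return the stored fields of up to `limit` documents as plain dicts."""
    reader = ix.reader()
    try:
        # all_stored_fields() yields one mapping of stored values per
        # non-deleted document; limit=None means "take everything".
        return [dict(fields) for fields in islice(reader.all_stored_fields(), limit)]
    finally:
        reader.close()


# Usage, assuming the index directory used earlier in this answer:
#   from whoosh.index import open_dir
#   ix = open_dir('whoosh_index')
#   for doc in dump_stored_fields(ix, limit=5):
#       print(doc)
```

Since each document comes back as an ordinary dict, it is easy to pretty-print or serialize for inspection.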
You could also return the documents directly from Whoosh in a Django view (perhaps using Django's template system for pretty formatting). Refer to the Whoosh documentation for more info: http://packages.python.org/Whoosh/index.html.