I'm looking to add search capability to an existing, entirely static website. The search functionality itself would likely need to be dynamic: the search index has to be updated periodically (as people make changes to the static content), and the search results have to be produced dynamically when a user interacts with it. I'd hope to add this functionality using Python, as that's my preferred language, though I'm open to ideas.
The Google Web Search API won't work in this case because the content being indexed is on a private network. Django Haystack won't work either, as it requires that the content be stored in Django models. A tool called mnoGoSearch might be an option, as I think it can spider a website like Google does, but I'm not sure how active that project is anymore; the project site seems a bit dated.
I'm curious about using tools like Solr, ElasticSearch, or Whoosh, though I believe those tools are only the indexing engine and don't handle fetching and parsing the content to be searched. Does anyone have any recommendations as to how one may index static HTML content so it can be retrieved as a set of search results? Thanks for reading and for any feedback you have.
With Solr, you would write code that retrieves the content to be indexed, parses out the target portions from each item, then sends them to Solr for indexing.
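As a rough sketch of that indexing step in Python, using only the standard library: it walks a directory of static HTML files, pulls out the title and visible text, and posts the results to Solr's JSON update handler. The field names `title` and `text`, the use of the relative file path as `id`, and the example core URL are all assumptions about your setup, not something Solr dictates, so adjust them to match your schema.

```python
import json
import urllib.request
from html.parser import HTMLParser
from pathlib import Path


class PageTextExtractor(HTMLParser):
    """Collect the <title> and the visible text of one HTML page."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.title_parts = []
        self.text_parts = []
        self._in_title = False
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth:
            return  # ignore JavaScript and CSS
        if self._in_title:
            self.title_parts.append(data)
        else:
            self.text_parts.append(data)


def html_to_doc(doc_id, html):
    """Turn one HTML string into a dict shaped like a Solr document."""
    parser = PageTextExtractor()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace so the indexed text is compact.
    title = " ".join("".join(parser.title_parts).split())
    text = " ".join(" ".join(parser.text_parts).split())
    return {"id": doc_id, "title": title, "text": text}


def collect_docs(site_root):
    """Build one document per .html file under site_root."""
    root = Path(site_root)
    docs = []
    for path in sorted(root.rglob("*.html")):
        html = path.read_text(encoding="utf-8", errors="replace")
        docs.append(html_to_doc(path.relative_to(root).as_posix(), html))
    return docs


def send_to_solr(docs, update_url):
    """POST the documents as a JSON array to Solr's update handler."""
    request = urllib.request.Request(
        update_url,
        data=json.dumps(docs).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status


# Example (hypothetical paths and core name):
# send_to_solr(collect_docs("/var/www/site"),
#              "http://localhost:8983/solr/site/update?commit=true")
```

Running something like this from cron, or from whatever process publishes the static content, would keep the index in step with the site.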
You would then interact with Solr for search, and have it return either the entire indexed document, an ID, or some other identifying information about the original indexed content, and use that to display results to the user.
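The search side could look roughly like the following, again with only the standard library. It queries Solr's `select` handler and turns the JSON response into a list of links to the original static pages. This assumes each document's `id` is the page's path relative to the site root and that a `title` field was indexed; both are assumptions about how you indexed the content, and the base URL in the example is hypothetical.

```python
import json
import urllib.parse
import urllib.request


def build_select_url(core_url, query, rows=10):
    """Build a URL for Solr's select handler that returns id and title only."""
    params = urllib.parse.urlencode(
        {"q": query, "fl": "id,title", "rows": rows, "wt": "json"}
    )
    return f"{core_url}/select?{params}"


def extract_hits(payload, site_base_url=""):
    """Map a decoded Solr JSON response to simple dicts for display."""
    hits = []
    for doc in payload.get("response", {}).get("docs", []):
        title = doc.get("title", "")
        # Some schemas store title as a multi-valued field (a list).
        if isinstance(title, list):
            title = title[0] if title else ""
        hits.append({"url": site_base_url + doc["id"], "title": title or doc["id"]})
    return hits


def search(core_url, query, site_base_url=""):
    """Run the query against Solr and return display-ready hits."""
    with urllib.request.urlopen(build_select_url(core_url, query)) as response:
        payload = json.load(response)
    return extract_hits(payload, site_base_url)


# Example (hypothetical core URL and site prefix):
# for hit in search("http://localhost:8983/solr/site", "text:backup", "/docs/"):
#     print(hit["title"], hit["url"])
```

A small web view (Flask, Django, or a CGI script) would call `search` and render the hits, so the static pages themselves never need to change.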