indexing lucene sitecore lucene.net sitecore7

Components not indexed in Sitecore Lucene search indexes


I have configured a Lucene search index in the configuration and tested it with the Luke (lukeall) tool. It finds all fields of the defined templates. However, the content on the pages comes from other, external components, and that content is not searchable; only the data in the page's own fields is. Is there a way to do something like an HTML search, so that all the data on the page is indexed?

Thanks guys.


Solution

  • It's a common requirement.

    This screencast outlines an approach where the crawler loops through each of the page's components (at about 38 minutes in).

    http://www.techphoria414.com/Blog/2012/May/Sitecore_Page_Editor_Unleashed

    The above example uses the old Advanced Database Crawler, but the principle is sound.

    Another common approach is to create a computed field in your index which causes the application to request the page, so its HTML can be scraped.

    https://github.com/hermanussen/sitecore-html-crawler

    My preference is the second option because it's more accurate.
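
To make the second option more concrete, here is a minimal sketch of the scraping step only: request a page and reduce its HTML to the visible text that would be stored in the index field. It is written in Python purely for illustration; in Sitecore this logic would live in a .NET computed field, and the names `VisibleTextExtractor`, `html_to_text` and `fetch_page_text` are hypothetical, not taken from the linked project.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class VisibleTextExtractor(HTMLParser):
    """Collects text nodes, ignoring the contents of <script> and <style>."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0   # > 0 while inside a skipped element
        self._chunks = []      # visible text fragments, in document order

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is not inside script/style.
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self):
        return " ".join(self._chunks)


def html_to_text(html):
    """Return the visible text of an HTML document as one string."""
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


def fetch_page_text(url, timeout=10):
    """Request the rendered page (hypothetical URL) and return its visible text."""
    with urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        html = response.read().decode(charset, errors="replace")
    return html_to_text(html)
```

Because the text comes from the rendered page, it includes whatever the components output, not just the item's own fields. The trade-off is one HTTP request per indexed page, so a real implementation would likely want timeouts and error handling so that a failing page does not break the index rebuild.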