I'm having trouble with indexing in Sitecore: rebuilding the general indexes "sitecore_master_index" and "sitecore_web_index" takes forever because the crawler/indexer checks every item in the database.
I imported thousands of products with a whole lot of specifications and now have literally hundreds of thousands of items in the product repository.
If I could exclude that path from indexing, the crawler wouldn't have to check a million items for template exclusion.
FOLLOWUP
I implemented a custom crawler that excludes a list of paths from being indexed:
<index id="sitecore_web_index" type="Sitecore.ContentSearch.SolrProvider.SwitchOnRebuildSolrSearchIndex, Sitecore.ContentSearch.SolrProvider">
  <param desc="name">$(id)</param>
  <param desc="core">sitecore_web_index</param>
  <param desc="rebuildcore">sitecore_web_index_sec</param>
  <param desc="propertyStore" ref="contentSearch/indexConfigurations/databasePropertyStore" param1="$(id)" />
  <configuration ref="contentSearch/indexConfigurations/defaultSolrIndexConfiguration" />
  <strategies hint="list:AddStrategy">
    <strategy ref="contentSearch/indexConfigurations/indexUpdateStrategies/onPublishEndAsync" />
  </strategies>
  <locations hint="list:AddCrawler">
    <crawler type="Sitecore.ContentSearch.Utilities.Crawler.ExcludePathsItemCrawler, Sitecore.ContentSearch.Utilities">
      <Database>web</Database>
      <Root>/sitecore</Root>
      <ExcludeItemsList hint="list">
        <ProductRepository>/sitecore/content/Product Repository</ProductRepository>
      </ExcludeItemsList>
    </crawler>
  </locations>
</index>
In addition, I activated SwitchOnRebuildSolrSearchIndex, since it's awesome out-of-the-box functionality. Cheers, SC.
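using System;
using System.Collections.Generic;
using System.Linq;
using Sitecore.ContentSearch;
using Sitecore.Diagnostics;

namespace Sitecore.ContentSearch.Utilities.Crawler
{
    // Crawler that skips every item whose path starts with one of the configured paths.
    public class ExcludePathsItemCrawler : SitecoreItemCrawler
    {
        private readonly List<string> excludeItemsList = new List<string>();

        // Receives the entries of the <ExcludeItemsList hint="list"> element in the crawler configuration.
        public List<string> ExcludeItemsList
        {
            get { return excludeItemsList; }
        }

        protected override bool IsExcludedFromIndex(SitecoreIndexableItem indexable, bool checkLocation = false)
        {
            Assert.ArgumentNotNull(indexable, "indexable");

            // Compare paths ordinally and case-insensitively rather than with the current culture.
            if (ExcludeItemsList.Any(path => indexable.AbsolutePath.StartsWith(path, StringComparison.OrdinalIgnoreCase)))
            {
                return true;
            }

            // Otherwise fall back to the standard exclusion rules.
            return base.IsExcludedFromIndex(indexable, checkLocation);
        }
    }
}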
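One thing to watch with a plain prefix check: an excluded path of /sitecore/content/Product Repository would also match a sibling such as /sitecore/content/Product Repository Archive. Below is a minimal sketch of a stricter comparison, written as a standalone helper with no Sitecore dependencies. The names PathExclusion, IsUnderPath and IsExcluded are hypothetical and not part of any Sitecore API, and the case-insensitive comparison is an assumption about how the paths should be matched.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration; not part of Sitecore.
public static class PathExclusion
{
    // True when itemPath equals excludedRoot or lies somewhere beneath it.
    public static bool IsUnderPath(string itemPath, string excludedRoot)
    {
        if (string.IsNullOrEmpty(itemPath) || string.IsNullOrEmpty(excludedRoot))
        {
            return false;
        }

        // Normalize so that "/a/b" and "/a/b/" are treated the same.
        string root = excludedRoot.TrimEnd('/');
        string path = itemPath.TrimEnd('/');

        if (path.Equals(root, StringComparison.OrdinalIgnoreCase))
        {
            return true;
        }

        // Requiring the trailing slash stops "/a/b" from matching "/a/bc".
        return path.StartsWith(root + "/", StringComparison.OrdinalIgnoreCase);
    }

    // True when itemPath falls under any of the excluded roots.
    public static bool IsExcluded(string itemPath, IEnumerable<string> excludedRoots)
    {
        return excludedRoots.Any(root => IsUnderPath(itemPath, root));
    }
}
```

Inside the crawler, the lambda could then call PathExclusion.IsUnderPath(indexable.AbsolutePath, path) instead of StartsWith, so that only the excluded node and its descendants are skipped.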
You can override the SitecoreItemCrawler class that is used by the index you want to change:
<locations hint="list:AddCrawler">
  <crawler type="Sitecore.ContentSearch.SitecoreItemCrawler, Sitecore.ContentSearch">
    <Database>master</Database>
    <Root>/sitecore</Root>
  </crawler>
</locations>
You can then add your own parameters, e.g. ExcludeTree or even a list of ExcludedBranches.
Then, in the implementation of the class, just override the method
public override bool IsExcludedFromIndex(IIndexable indexable)
and check whether the item is under an excluded node.